PROCEEDINGS.  THIRD  LOUISVILLE  CONFERENCE 
ON  RATE-CONTROLLED  SPEECH 


NOVEMBER  4-5, 1975 


Digitized  by  the  Internet  Archive 

in  2012  with  funding  from 

Lyrasis  Members  and  Sloan  Foundation 


http://archive.org/details/proceedingsthird1975emer 



Stouffer's  Louisville  Inn,  Louisville,  Kentucky 


Distributed  by  American  Foundation  for  the  Blind,  Inc. 
15  West  16th  Street,  New  York,  N.Y.   10011 




TABLE  OF  CONTENTS 

FOREWORD

CHAPTER

1. Calibration Procedures for Time-Compressed Speech
   Dan F. Konkle, Barry A. Freeman, Donald Riggs, Linda L. Riensche, and Daniel S. Beasley

2. Musical Pitch Shifting: A Novel Application of Technology Useful in Rate Control
   Richard F. Koch

3. Reaction Time in Identifying Time-Compressed Words
   Lawson H. Hughes and Emerson Foulke

4. The Wichita Studies on Comprehension of Rate-Controlled Speech
   Robert L. McCroskey and Nickola W. Nelson

5. Theories of the Origins of Language
   Murray S. Miron

6. Observations on the Measurement of Listening Comprehension and Learning Efficiency as a Function of Time-Compressed Speech
   Phyllis H. Skinner and David B. Orr

7. The Effects of Irrelevant Concurrent Psychomotor Activity on the Ability to Comprehend Compressed Speech
   Joe T. Billman

8. Normal Hearing Children's Intelligibility of Time-Compressed Words
   Jane Shoup, Daniel S. Beasley, Jean E. Maki, and Fred Bess

9. The Performance of Children who Display Auditory Processing Disorders on a Time-Compressed Speech Discrimination Task
   Walter H. Manning, Kathleen L. Johnston, Daniel S. Beasley

10. Effects of Time Compression upon Process Variables in Counseling Dialog
    Reiko Schwab

11. Temporal Redundancy and the Esthetics of Time-Compressed Speech: A Conceptualization
    Theodore L. Glasser

12. Toward a Theory of Rate-Controlled Speech
    Charles L. Meadows

13. Comments on the Use of Rate-Controlled Recordings to Improve Speech
    Thomas G. Sticht

14. Correlates of Successful Speech Compression Use by Blinded Veterans
    W. De l'Aune, C. Lewis, W. Needham, and J. Nelson

15. A Study of the Instructional Potential of Compressed Speech for Postsecondary Technical Students
    Donna L. Wood

16. Recent Army Research in Compressed Speech
    Joyce L. Shields

17. Can Students in a Self-Paced Course Save Time and Earn Higher Grades Using Time-Compressed Speech?
    Sarah H. Short

18. The Effect of Fixed and Learner Selected Rates of Compressed Speech in an Audio-Tutorial Learning Environment on the Achievement of College Level Students
    A. James Challis

19. A Comparative Investigation of Listening Rate Preference Employing Two Methods of Temporal Alteration
    Herbert A. Leeper, Jr., and Norman J. Lass

20. Exposure to Time-Compressed Speech: Effect on Subjects' Listening Rate Preferences and Listening Comprehension Skills
    Norman J. Lass, Emerson Foulke, Ann A. Nester, and Joanne Comerci

21. Listeners' Preferences for the Rate of Presentation of Recorded Information
    S. Joseph Levine

22. The Effect of Time-Compressing an Audiovisual Instructional Program upon Learning and Retention
    B. Eugene Koskey

23. The Effect of Different Levels of Audio and Video Compression upon a Televised Demonstration in Microbiology
    MaryAnn Blind

24. Time-Expanded Speech: Clinical Applications to the Diagnosis of Speech Disorders
    Norman J. Lass, Emerson Foulke, and Rebecca A. Supler

25. SSLR: Simultaneous Speeded Listening and Reading. A Promising Path to Remediation of Reading Disabilities
    Shirley N. Winters


FOREWORD 
Background 

The  University  of  Louisville  has  now  been  host  for  three  conferences 
of individuals concerned, in various ways, with time-compressed and
time-expanded speech. These conferences have provided the opportunity for
representatives  of  such  diverse  disciplines  as  education,    engineering, 
linguistics,   manufacturing,    program  administration,    psychology,    and  speech 
science  to  learn  from  each  other  by  discussing  their  mutual   interests  in 
facilitating  a  variety  of  human  activities  by  exercising  the  control  over  the 
word  rate  of  recorded  speech  that  has  been  made  possible  and  practical  by 
recent  developments   in  electronic  technology. 

The  first  conference  was  held  in  1966,  and  the  second  in  1969.  Much 
of the significant research on time-compressed and time-expanded speech
that  had  so  far  been  accomplished  was  reported  at  these  conferences,  and 
knowledge  of  this  research  was  disseminated  more  widely  by  means  of  the 
volumes  of  conference  proceedings  that  were  prepared.  These  volumes  have 
been  distributed  to  an  international  readership,  and  requests  for  them  are 
still  being  filled. 

A  significant  outgrowth  of  the  first  conference  was  the  establishment, 
at  the  University  of  Louisville,    of  the  Center  for  Rate-Controlled  Recordings. 
This Center is a unit of the Perceptual Alternatives Laboratory. At the time
the Center was established, the equipment needed for the time compression
and  time  expansion  of  recorded  speech  was  expensive,    difficult  to  operate, 
and  scarce.      Consequently,    one  of  its  original  objectives  was  to  provide  an 
inexpensive  source  for  rate-controlled  recorded  speech  of  satisfactory  quality, 
so  that  researchers  could  investigate  the  effect  of  altered  word  rates  on  aural 
communication,    and  educators  and  others  with  practical  interests   could  gain 
experience  with  its  application  in  a  variety  of  settings.      The  problems 
engendered  by  the  equipment  then  available  for  the  time  compression  and 
expansion  of  speech  have  largely  been  solved.     Speech  compressors  of  high 
quality  that  are  easy  to  operate  are  now  readily  available  and  relatively 
inexpensive. We expected that as inexpensive and satisfactory equipment
became available, the Center would no longer receive orders for the
preparation of rate-controlled recordings. However, the increased availability of
equipment  has  been  accompanied  by  a  renewed  interest  in  the  possibilities 
suggested  by  the  ability  to  control  the  word  rate  at  which  recorded  speech  is 
reproduced,    and  apparently,    many  investigators  prefer  to  gain  initial 
experience  with  rate-controlled  recorded  speech  prepared  at  the  Center 
before  purchasing  their  own  equipment.      In  any  case,    the  Center  continues  to 
receive a stream of requests for the preparation of rate-controlled recorded
speech. 

Another objective of the Center was to disseminate information
concerning rate-controlled recorded speech. This has been accomplished by
the organization of national conferences, the distribution of conference
proceedings and other research reports, and the publication of the CRCR
Newsletter, a bi-monthly newsletter that is distributed internationally to
persons interested in the production and use of time-compressed and
time-expanded speech.

Those who attended the First Louisville Conference shared the hope
that the ability to control the word rate at which recorded speech is
reproduced would make it possible to improve at least the efficiency and perhaps
the effectiveness of aural communication. They realized that the ability to
compress or expand the time needed for the reproduction of recorded speech
would be of little practical value unless some solution to the equipment
problem could be found. However, the favorable outcomes of several
experiments and demonstrations had been well publicized, there was a growing
suspicion that time compression might have commercial potential, and there
was reason to believe that developments in speech compression technology
would soon solve the equipment problem. By the time the Second Louisville
Conference was held, the equipment problem had been eased by the
introduction of an improved electromechanical compressor. This compressor
was still expensive and still difficult to operate, but it was readily available,
it could generate both time-compressed and time-expanded speech, and its
signal quality was relatively high. Though it was still not practical for each
listener to own and operate his personal speech compressor, an organization
with an adequate technical staff could purchase and maintain a speech
compressor, and so the feasibility of a center that would prepare time-compressed
recorded reading matter for its clients was given serious consideration by
several agencies serving persons who find it advantageous to read by
listening.


With this arrangement, a listener could take advantage of time
compression to save some of the time spent in reading by listening, and if he had
a great deal of listening to do, the savings he could realize might be
worthwhile. However, the listener would still not be able to choose his own word
rate,    and  to  change  that  word  rate  as  the  nature  of  the  reading  matter  and  his 
purpose  for  reading  changed. 

By the time of the Second Louisville Conference, there had been
several successful demonstrations of the use of a computer to compress or
expand the time needed to display a speech signal. The practical significance
of these demonstrations was that if a computer could be used for compression
and expansion, it should also be possible to design, and produce at a
relatively low cost, an electronic device specifically for that purpose. As a
matter of fact, the development of such devices was already under way, and
there was even the prospect that integrated circuit techniques could be used
to bring about a dramatic reduction in both the size and the cost of the
electronic hardware used for compression and expansion.

At the close of the Second Louisville Conference, there was a general
agreement that a third conference should be planned in a year or two.
However, six years passed before another conference could be organized. During
that period, the anticipated developments in the electronic technology required
for the compression and expansion of speech were achieved. However, during
the same period, it began to be apparent that the policies which guided the
decisions of the federal funding agencies were changing, and there was a
sharp reduction in the amount of money made available for such activities as
research projects, demonstration projects, and conferences. Consequently,
although there was a continuous and growing expression of desire for another
conference, the possibility of obtaining the money needed to meet conference
expenses seemed remote. During this period, many developing plans for the
distribution of compressed speech services and for the exploration of new
applications of compressed and expanded speech were postponed or abandoned.

Then, in 1974, the American Foundation for the Blind made a grant of
$3,000 to the Perceptual Alternatives Laboratory, University of Louisville,
for the purpose of meeting the expenses associated with the Third Louisville
Conference on Rate-Controlled Recorded Speech. A conference planning
committee, including Dr. Emerson Foulke, Director, Perceptual Alternatives
Laboratory, University of Louisville; Dr. Lawson H. Hughes, Instructional
Systems Technology and the Audio-Visual Center, Indiana University;
Dr. Norman J. Lass, Speech and Hearing Sciences Laboratory, School of
Medicine, West Virginia University; Dr. Murray S. Miron, Psycholinguistics,
Psychology Department, Syracuse University; and Dr. Sarah Short, College
for Human Development, Syracuse University, was appointed, and with their
generous assistance, the conference was planned during the spring and
summer months of 1975. It was convened on November 4th and 5th, 1975, at
Stouffer's Louisville Inn, under the auspices of the University of Louisville.
The conference was an intellectual as well as a social success, and we owe
this success, in large part, to the skillfully articulated organization achieved
by the staff of the Perceptual Alternatives Laboratory, under the able
direction of Mrs. Lela Johns, the laboratory's Program Assistant.


The  Conference  Program 

The  program  on  the  first  conference  day  was  to  have  been  initiated  by 
an  invited  address  from  Dr.   Sam  Duker.     Sam,    whose  work  in  the  area  of 
listening has earned for him an internationally recognized position, has
observed and recorded almost the entire history of rate-controlled recorded speech. His
anthology, "Time-Compressed Speech: An Anthology and Bibliography in
Three Volumes," is the best single source of information about
time-compressed and time-expanded speech. Unfortunately, he was prevented by
poor  health  from  attending  the  conference.      In  his  place,    Dr.    Emerson  Foulke 
reviewed  developments  in  speech  compression  technology.     Although  this 
review  may  have  been  helpful  to  members  of  the  audience  who  had  not  closely 
followed  the  developing  technology,    it  contained  no  information  that  had  not 
already  been  published  elsewhere,    and  his   remarks  have  therefore  not  been 
included  in  this  volume. 

Dr.   Murray  Miron  was  to  have  delivered  a  luncheon  address  on  the 
first  conference  day.     Unforeseen  circumstances  prevented  him  from  attending 
the  conference,    but  he  submitted  a  written  version  of  his  address,    and  it  is 
included  in  the  proceedings. 

The program of the second conference day was initiated by an invited
address by Dr. Thomas G. Sticht, Senior Staff Scientist, HumRRO/Western
Division. He related the comprehension of time-compressed speech to the
context of more general problems associated with comprehending language
in spoken or written form, and offered valuable criticisms and suggestions
concerning the investigative methodologies employed by those who study the
comprehension of time-compressed speech. His address is included in this
volume.


The  conference  planning  committee  designated  a  coordinator  for  each 
of  three  areas   in  which  reports  were  solicited.      It  was  the   responsibility  of 
each  coordinator  to   review  those  papers   submitted  for  presentation  at  the 
conference  that  belonged  in  his  area,    to  evaluate  them  in  terms  of  criteria 
agreed  upon  by  the  conference  planning  committee,    to  make  the  final  selection 
of  papers  for  presentation  at  the  conference,    and  to  serve  as  chairman  of  the 
session  in  which  the   reports   in  his  area  were  presented.      From  this  process 
emerged  the  conference  program  that  is  described  briefly  in  the  following 
paragraphs.

Reports of Technical Research and Development

Dr. Daniel S. Beasley, Audiology and Speech Sciences, Michigan
State University, served as the coordinator for this area. One of the technical
reports was made by Masahiro Kosaka, Chief Engineer, Wireless Research
Laboratory, Matsushita Electric Industrial Co., Ltd., of Japan. He
discussed some aspects of circuit design which must be considered in order to
maximize the intelligibility of time-compressed speech. Because of a failure
in communication, Mr. Kosaka was not aware that we would need a written
version of his report. He delivered only an oral report, and we sincerely
regret that we have not been able to include his report in the conference
proceedings.

Reports of Basic Research

The coordinator of this area was Dr. Lawson H. Hughes. His area
included reports in which time compression was the independent variable of
primary interest, reports in which the independent variables of primary
interest were those which can affect the relationship between the independent
variable of time compression and the dependent variables of intelligibility and
comprehension, reports in which intelligibility and comprehension were the
dependent variables of primary interest, and reports in which dependent
variables other than intelligibility and comprehension were the variables of
primary interest. The final report in this area was made by Mr. Charles
Meadows, Director, Foreign Language and Special Learning Laboratories,
Morehouse College, who discussed some of the requirements for a theory of
time-compressed speech.

Reports of Applied Research

Dr. Norman J. Lass was the coordinator of activities in this area.
New applications of time-compressed and time-expanded speech were
reported, and research concerning factors that must be taken into account in
applications of rate-controlled recorded speech was presented. The topics
covered included the effectiveness of time-compressed speech for different
populations of listeners, the use of time-compressed speech to facilitate
audiotutorial instruction, the word rates preferred by listeners, the efficacy
of compressed bimodal presentations, and clinical uses of time-compressed
and time-expanded speech.

New Equipment and Materials

In addition to the information presented in the formal conference
program, information of considerable practical importance was presented by
manufacturers of equipment and publishers of books and recordings. One
publisher, The Christopher S. Riley Company, displayed examples of
time-compressed recordings of prose, poetry, and drama that may be purchased
from the company. Dr. Duker's book, "Time-Compressed Speech: An
Anthology and Bibliography in Three Volumes," was exhibited by the
Scarecrow Press, Incorporated.

Several manufacturers exhibited instruments for the compressed and
expanded reproduction of recorded speech, and alternative approaches to the
incorporation of the electronic hardware needed to add the functions of
compression and expansion to a recorder were demonstrated. Some manufacturers
showed cassette recorders in which the electronic hardware needed for
compression and expansion was included as an integral part of their circuitry
(Lexicon's Varispeech II, Magnetic Video's Copycorder, Panasonic, and the
Sony Corporation). Other manufacturers showed independent units containing
only the electronic hardware needed for compression and expansion (Hayward
Manufacturing Company's Analog Rate Changer, and the unit marketed by the
American Printing House for the Blind for sale to those who qualify as clients).
An instrument of this type must be used in conjunction with a cassette recorder,
tape recorder, or record player that has been modified so that its playing
speed is continuously variable. Playing speed is adjusted to achieve the
desired word rate, and pitch is restored to its proper value by adjusting a
control on the external unit. This arrangement is a little less convenient for
the operator, since controls on two separate units must be adjusted in order
to obtain the desired word rate at the correct pitch. However, its ability to
compress or expand the output signal of any tape or record reproducer with
adjustable playing speed is an advantage in some situations. One company,
LaBelle Industries, Incorporated, exhibited a video tape reproducer that
includes the capability of compressing or expanding the speech recorded on
the audio track of the video tape it reproduces. Thus equipped, it allows for
the simultaneous variation of the rate of presentation of both the audio display
and the visual display. Another company with a booth in the exhibitors' area
was the Cambridge Research and Development Group. This is the company
that developed the integrated circuit chip for the process it calls VSC (variable
speed control) that is used in the instruments sold by several manufacturers.

Prospects 

Examination  of  the  equipment  exhibited  at  the  conference  warrants  at 
least  one  firm  conclusion.      The  technology  needed  to  make  the  ability  to  vary 
the  word  rate  of  recorded  speech  a  practical  possibility  has  finally  been 
developed.      Compressors  with  good  signal  quality  that  are  compact,    easy  to 
operate,    and  relatively  inexpensive  are  now  readily  available.      Furthermore, 
if  enough  demand  for  these  instruments  develops,    the  savings   realized  by 
the  production  of  electronic  hardware  in  high  volume  will  permit  their  costs 
to  the  consumer  to  be  further   reduced  by  a  significant  amount. 

Until  now,    the  average  listener  has  not  had  easy  access  to  instruments 
that  can  compress  or  expand  recorded  speech.     Consequently,    researchers 
have  been  reduced  to  the  necessity  of  trying  to  learn  something  about  the 
perception of time-compressed and time-expanded speech by assessing the
performance  of  listeners  who  have   had  little  or  no  prior  experience  with  such 
speech,    and  who  have  received  only  brief  exposures  during  the  experiments 
in  which  they  have  served  as  subjects.      With  instruments  for  the  compression 
and  expansion  of  recorded  speech  generally  available,    listeners  should  begin 
to  accumulate  experience  with  their  use,    and  at  the  Fourth  Louisville 
Conference on Rate-Controlled Recorded Speech, we can expect to hear reports
of research in which it has been possible to consult the experience of listeners
who have taken extensive advantage of the ability to control the word rate of
recorded speech for serious listening purposes. In the past, since the
distribution of recordings that had been compressed at a central facility appeared
to  be  the  only  feasible  way  of  making  compressed  speech  available,    a  good 
deal  of  research  effort  was  spent  in  the  attempt  to  determine  the  optimum 
word  rate  for  various  listening  purposes  and  listening  populations.      Now  that 
listeners  are  able  to  manage  word  rates  for  themselves,    they  will  be  making 
their  own  decisions  about  the  word  rates  that  are  appropriate  for  various 
kinds  of  reading  matter  and  listening  purposes.      By  consulting  the  experience 
of  these  listeners,    researchers  may  be  able  to  learn  something  about  the 
efficient  management  of  word  rate,    and  to  devise  training  programs  that 
teach  listeners  to  manage  the  word  rate  variable  in  a  way  that  maximizes  their 
comprehension. 


Emerson  Foulke,    Ph.D. 




Calibration Procedures for Time-Compressed/Expanded Speech

by Konkle, D. F., Freeman, B. A., Riggs, D., Riensche, L. L., & Beasley, D. S.


ABSTRACT 

The  purpose  of  this  paper  was  to  present  methods  and 
procedures  for  the  measurement  and  specification  of  speech 
signals  time  compressed/expanded  via  the  sampling  process. 
Both  the  mathematical  basis  and  the  instrumentation  necessary 
for  such  calibrations  were  detailed.   The  procedures  outlined 
in  this  paper  are  relatively  simple  and  can  be  adequately 
performed in field work. Hence, they should provide a basis for
consistent  and  reliable  performance  of  electromechanical  and 
electrical  time  compressors/expanders. 




TITLE:   Calibration Procedures for Time-Compressed/Expanded
Speech 

AUTHORS:   Dan  F.  Konkle,  MA 

Audiology  and  Speech  Sciences 
Michigan  State  University 
East  Lansing,  Michigan  48824 

Barry  A.  Freeman,  Ph.D. 
Audiology  and  Speech  Sciences 
Michigan  State  University 
East  Lansing,  Michigan  48824 

Donald Riggs, AA
Bruel and Kjaer Instruments
Pittsburgh, Pennsylvania

Linda  L.  Riensche,  MA 
Audiology  and  Speech  Sciences 
Michigan  State  University 
East  Lansing,  Michigan  48824 

Daniel  S.  Beasley,  Ph.D. 
Audiology  and  Speech  Sciences 
Michigan  State  University 
East  Lansing,  Michigan   48824 


ADDRESS CORRESPONDENCE TO LAST AUTHOR (BEASLEY).


INTRODUCTION 

Several devices are currently available to either
electromechanically or electrically time compress or expand speech
stimuli. Fairbanks, Everitt, and Jaeger (1954) described an
electromechanical instrument that employed the sampling process
via a continuous tape loop that passed over a rotating record
head assembly. Since its introduction, this electromechanical
processor has received extensive use and has been described
numerous times in the literature. The utility of the Fairbanks
processor, however, has been restricted primarily to laboratory
use because it was both expensive and bulky. In recent years,
therefore, several more portable, less expensive instruments
have become commercially available for the purpose of
electrically time compressing or expanding speech materials. Because
of their portability, these smaller devices have distinct
advantages in field work.

During the past four years the Language and Speech Perception
Laboratories at Michigan State University have used the
Lexicon Varispeech I portable time compressor/expander in several
research projects. Briefly, this instrument consists of a
portable cassette tape recorder with a normal record and
playback speed of 1 7/8 inches per second. In addition, alternative
playback speeds may be selected ranging from one-half of normal
speed (100% time expansion) to two and one-half times normal speed
(60% time compression), with normal speed corresponding to 0% time
compression/expansion. Pitch correction of the rate altered signal
is accomplished by a miniature
computer module that converts the playback signal to digital
data, performs sampling operations with a fixed discard interval
of 25 to 30 msec., and reconverts the sampled digital data back
to an analog speech signal. Both playback speed and pitch
correction are selected by a single external control; however, the
pitch correction process may be by-passed, thereby allowing either
slow or fast play frequency shifting. Frequency shifted material
can then be processed through the compressor/expander in order to
restore its original duration. Hence, the Varispeech I allows
for individual variation of either the time or frequency
parameters, and covariations of both the time and frequency
components of speech stimuli.
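
As a rough illustration of the sampling process described above, the following Python sketch removes periodic chunks of a fixed discard interval from a digitized signal so that the result is shorter by a chosen percentage. It does not reproduce the Varispeech I circuitry: the function name `compress_by_sampling`, the 8000 samples-per-second rate, and the abrupt abutting of retained chunks are assumptions made here for illustration only.

```python
def compress_by_sampling(samples, sample_rate_hz, percent_compression,
                         discard_ms=25.0):
    """Shorten a sampled signal by periodically discarding fixed-length chunks.

    Hypothetical helper: retained chunks are simply abutted, with no
    smoothing at the splice points.
    """
    if not 0 < percent_compression < 100:
        raise ValueError("percent_compression must be between 0 and 100")
    # Number of samples thrown away in each cycle (the fixed discard interval).
    discard_len = round(sample_rate_hz * discard_ms / 1000.0)
    # Retained length per cycle, chosen so that
    # discard_len / (keep_len + discard_len) == percent_compression / 100.
    keep_len = round(discard_len * (100.0 - percent_compression)
                     / percent_compression)
    if discard_len < 1 or keep_len < 1:
        raise ValueError("intervals are too short for this sample rate")
    output = []
    position = 0
    while position < len(samples):
        # Keep one chunk, then skip over the chunk to be discarded.
        output.extend(samples[position:position + keep_len])
        position += keep_len + discard_len
    return output


# One second of dummy data at an assumed 8000 samples per second.
signal = list(range(8000))
half = compress_by_sampling(signal, 8000, 50)
print(len(half) / len(signal))  # 0.5
```

With a fixed discard interval, the amount of compression is set entirely by how much signal is retained between discards, which is why only `keep_len` depends on the requested percentage.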

Research concerned with the perception of time altered
speech requires that time and frequency parameters be strictly
specified. To date, however, there have been no published reports
directly concerned with the methodology necessary to calibrate
and measure the output of electrical and electromechanical time
compressors/expanders. The purpose of this paper, therefore, is
to present procedures developed at Michigan State University to
measure and specify values of time compressed, time expanded,
frequency shifted, and frequency shifted/time restored speech
stimuli. Although these procedures were developed with the
Lexicon Varispeech I, they apply to any device that uses sampling
procedures to time alter speech material.




METHOD 

The only apparatus required to perform these calibration
measurements is a tape recording of a 1000 Hz pure tone and
either an oscilloscope or a frequency counter. With the
frequency counter or oscilloscope connected to the output of the
time compressor/expander, variations in the output frequency of
the 1000 Hz recording obtained as the result of changing
playback speed may be easily monitored. For example, when playback
speed equals record speed, the expected readout on the frequency
counter or oscilloscope would be 1000 Hz. Similarly, when
playback speed is double the original record speed, the output
frequency of the 1000 Hz recording will be 2000 Hz.
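
The proportional relationship between playback speed and output frequency can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function name `output_frequency_hz` and its parameters are assumptions introduced here, not part of the original procedure.

```python
def output_frequency_hz(playback_speed, record_speed, tone_hz=1000.0):
    """Frequency read at the output when a recorded tone is played back
    at a different speed, with the pitch corrector bypassed."""
    if playback_speed <= 0 or record_speed <= 0:
        raise ValueError("speeds must be positive")
    # Output frequency scales in direct proportion to the speed ratio.
    return tone_hz * playback_speed / record_speed


NORMAL_IPS = 1.875  # 1 7/8 inches per second, the normal cassette speed

print(output_frequency_hz(NORMAL_IPS, NORMAL_IPS))      # 1000.0
print(output_frequency_hz(2 * NORMAL_IPS, NORMAL_IPS))  # 2000.0
```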

Since the concept of proportional frequency shift with a
given change in playback speed is the basis for the calibration
measurements discussed in this paper, it is important to consider
the source of the 1000 Hz pure tone. Because there are frequently
slight variations between the record and playback speeds of
different tape recorders, it is recommended that the 1000 Hz tone
be recorded directly on the specific time compressor/expander to
be calibrated. In this manner, changes in frequency that may
result from different speeds between two separate recorders are
eliminated. In the same manner, speech material should also be
directly recorded on the specific time compressor/expander under
use before any further processing takes place.

Through the use of the proportional relationship between speed
change and frequency shift, it is a relatively simple matter to
adjust the playback speed of the 1000 Hz recording (with the pitch
corrector turned off) until output frequencies are reached
that correspond to desired changes in signal duration. Figures
1 through 6 illustrate various formulae that may be used to
compute the specific output frequencies necessary to obtain
desired amounts of time compressed, time expanded, and
frequency shifted/time restored stimuli.

Figure 1 depicts an algebraic equation that may be used to
obtain a given level of time compression. It should be noted
that time-compression values are given in terms of percent
compression. That is, if the speech signal is compressed so that
it takes 30% less playback time, the total signal is specified
as being time compressed by 30%. Thus, in the example of the
first figure the desired level of time compression has been
chosen as 30%. The equation is solved by

$$\frac{100 - 30}{100} = \frac{1000}{f}, \quad \text{or} \quad f = \frac{1000}{0.70}.$$

In the example, therefore, the playback speed
of the processor would be adjusted so that the output read on
the frequency counter or oscilloscope equals 1429 Hz, or 30%
time compression.

Figure 2 shows an algebraic equation to obtain desired
frequency readings for time expansion. Again, using 100% as unity,
time expansion is expressed in terms of percent of additional
duration. For a level of 30% time expansion, the equation would
be solved by $\frac{100 + 30}{100} = \frac{1000}{f}$, or f = 769 Hz. Hence, in order to
achieve 30% time expansion, the playback speed of the processor
would be slowed until a readout of 769 Hz was reached.
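
The two calibration formulae can be checked numerically. The following is a minimal sketch, assuming the relationships described for Figures 1 and 2; the function names and the `tone_hz` parameter are illustrative and do not come from the paper.

```python
def compression_frequency(percent_tc, tone_hz=1000.0):
    """Counter reading (Hz) expected for a given percent time compression.

    Rearranges (100 - %TC) / 100 = tone_hz / f for f.
    """
    if not 0 <= percent_tc < 100:
        raise ValueError("percent_tc must be at least 0 and below 100")
    return tone_hz * 100.0 / (100.0 - percent_tc)


def expansion_frequency(percent_te, tone_hz=1000.0):
    """Counter reading (Hz) expected for a given percent time expansion.

    Rearranges (100 + %TE) / 100 = tone_hz / f for f.
    """
    if percent_te < 0:
        raise ValueError("percent_te must not be negative")
    return tone_hz * 100.0 / (100.0 + percent_te)


# The two worked examples from the text, rounded to the nearest Hz.
print(round(compression_frequency(30)))  # 1429
print(round(expansion_frequency(30)))    # 769
```

With the pitch corrector turned off, the playback speed would be adjusted until the counter shows the returned value.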


$$\frac{100 - \%TC}{100} = \frac{1000}{f}, \quad \text{or} \quad f = \frac{1000 \times 100}{100 - \%TC}$$

where:

%TC = percent time compression
f = playback frequency

Example: for %TC = 30, f = 1429 Hz.

Figure  1.   Algebraic  equation  for  computing  the  output  playback 
frequency  of  a  1000  Hz  calibration  tone  that  will 
correspond  to  a  given  amount  of  time-compression. 




$$\frac{100 + \%TE}{100} = \frac{1000}{f}, \quad \text{or} \quad f = \frac{1000 \times 100}{100 + \%TE}$$


where: 

%TE=  percent  time  expansion 
f  =  playback  frequency 


Figure  2.   Algebraic  equation  for  computing  the  output  playback 
frequency  of  a  1000  Hz  calibration  tone  that  will 
correspond  to  a  given  amount  of  time  expansion. 




Figure 3 indicates the calculated frequencies necessary to
obtain levels of time compression or expansion for values from
0% to 60% in 10% steps. For example, for 50% time compression
the adjusted frequency is 2000 Hz, whereas for 50% time
expansion it would be 667 Hz. Values other than those shown in
Figure 3 may be derived for reference by using the formulae
illustrated in Figures 1 and 2.
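
Values not listed in Figure 3 can be tabulated directly from the two formulae. The sketch below is one way to do so, assuming a 1000 Hz calibration tone; `playback_frequencies` and `calibration_table` are hypothetical helper names introduced only for this illustration.

```python
def playback_frequencies(percent, tone_hz=1000.0):
    """Return (compression_hz, expansion_hz) for one percent value."""
    compression_hz = tone_hz * 100.0 / (100.0 - percent)
    expansion_hz = tone_hz * 100.0 / (100.0 + percent)
    return compression_hz, expansion_hz


def calibration_table(percents, tone_hz=1000.0):
    """Build rows of (percent, compression Hz, expansion Hz), rounded to 1 Hz."""
    rows = []
    for percent in percents:
        comp, exp = playback_frequencies(percent, tone_hz)
        rows.append((percent, round(comp), round(exp)))
    return rows


# Tabulate 0% to 60% in 10% steps, the range covered by Figure 3.
for percent, comp, exp in calibration_table(range(0, 70, 10)):
    print(f"{percent:>3}%  {comp:>5} Hz  {exp:>5} Hz")
```

Any other step size, such as 5%, can be passed in the same way.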

After  the  playback  speed  of  the  time  compressor/expander 
has  been  altered  to  provide  the  desired  amount  of  compression 
or  expansion  and  the  pitch  corrector  is  activated,  the  readout 
on  the  counter  or  oscilloscope   should  revert  to  1000  Hz.   Speech 
stimuli  may  then  be  substituted  for  the  pure  tone  signal  and 
subsequently  processed. 

Recall that the Lexicon Varispeech I is only capable of
time compression up to 60%, although time expansion values are
available to 100%. In situations where it is necessary to time
compress or time expand beyond these values, it will be
necessary to run the speech sample through the processor more than
one time. It is important to realize that in such cases the
amounts of time alteration are not a matter of simple addition.
That is, if the signal were time compressed by 30% during the
first run and then by 40% in the second run, the result would
not be 70% compression. Instead, the total amount of time
compression would be only 58%. This principle is illustrated by
reference to Figure 4.
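
The 58% figure follows if each run is taken to act on the duration left by the previous run. The sketch below makes that assumption explicit; `total_compression` is an illustrative name, not a term from the paper.

```python
def total_compression(*runs):
    """Total percent compression after successive runs.

    Assumes each run removes its percentage from whatever duration
    remains after the earlier runs, so remaining fractions multiply.
    """
    remaining = 1.0
    for percent in runs:
        if not 0 <= percent < 100:
            raise ValueError("each run must be at least 0 and below 100")
        remaining *= (100.0 - percent) / 100.0
    return 100.0 * (1.0 - remaining)


# 30% on the first run, then 40% on the second: 0.70 * 0.60 = 0.42 remains.
print(round(total_compression(30, 40), 1))  # 58.0
```

The order of the runs does not change the total, since multiplication is commutative.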




RATIO (%)    COMPRESSION (Hz)    EXPANSION (Hz)

    0             1000               1000
   20             1250                833
   30             1429                769
   40             1667                714
   50             2000                667
   60             2500                625

Playback frequencies for a 1000 Hz tone
necessary to reach the given ratio of
time alteration.


Figure 3.   Computed output frequencies for a 1000 Hz calibration
tone for amounts of time compression/expansion
from 0% to 60% in 10% steps.


[Figure 4 graphic: algebraic equation (upper part) and worked
example (lower part); not legible in this copy.]

Figure 4.   Algebraic equation and example for computing the
percentages of time compression when it is necessary
to process stimuli more than once to obtain a given
total amount of time compression.




The upper part of Figure 4 depicts an equation for computing
time-compression values when it is necessary to process
the stimuli more than one time. Notice that the equation is
based on a two step procedure, with each step computed from
unity before addition of the separate steps takes place. As
before, total amount of time compression in the figure is
expressed as percent of discard.

In  the  example  on  the  lower  half  of  Figure  4,  a  desired 
level  of  time  compression  has  been  chosen  equal  to  70%.   In 
order  to  solve  the  equation,  it  is  first  necessary  to  specify 
the  percent  of  compression  for  either  the  first  or  second  run. 
Thus,  40%  time  compression  has  been  arbitrarily  chosen  for  the 
first  run.   The  equation  may  then  be  solved  with  the  amount  of 
time  compression  for  the  second  run  equal  to  50%.   Hence,  the 
combination  of  these  two  values  will  result  in  a  total  of  70% 
time compression. It would also be possible to use 50% compression
in the first run and 40% in the second and still obtain a
total  of  70%  time  compression. 
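
The second-run setting in this example can be solved for directly. The following sketch assumes, as above, that the remaining durations of the two runs multiply; the function name is hypothetical and the paper's own equation in Figure 4 may be arranged differently.

```python
def second_run_compression(total_percent, first_percent):
    """Percent compression needed on run 2 to reach a desired total.

    Solves (100 - total) / 100 = ((100 - first) / 100) * ((100 - second) / 100)
    for the second-run percentage.
    """
    if not 0 <= first_percent <= total_percent < 100:
        raise ValueError("need 0 <= first_percent <= total_percent < 100")
    return 100.0 * (1.0 - (100.0 - total_percent) / (100.0 - first_percent))


# Desired total of 70% with 40% chosen for the first run.
print(round(second_run_compression(70, 40), 1))  # 50.0
# Swapping the runs: 50% first requires 40% second.
print(round(second_run_compression(70, 50), 1))  # 40.0
```

In practice each run would also have to stay within the range of the device, which the text gives as 60% compression for the Varispeech I.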

Figure 5 shows a similar equation that may be used to obtain
values for time expansion when two runs are necessary. The
principles of the equation are the same as those just noted for time
compression. As before, total expansion involves a two step
procedure with each level of expansion computed from unity before
addition of the two levels takes place.

[Figure 5 graphic: algebraic equation and worked example; not
legible in this copy.]

Figure 5.   Algebraic equation and example for computing the
percentages of time expansion when it is necessary to
process stimuli more than once to obtain a given total
amount of time expansion.

In the example associated with Figure 5, 120% has been chosen
as the total amount of time expansion and the percent of expansion
for the first run has been arbitrarily set at 60%. Solution of
the equation indicates that the amount of expansion for the
second run necessary to obtain 120% total expansion would be
approximately 37%. Again, it would also be possible to use 37%
expansion for the first run and 60% for the second run to obtain
the same total expansion of 120%.
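
The expansion case can be worked the same way. This sketch assumes that the duration ratios of successive runs multiply, which reproduces the example in the text; `second_run_expansion` is an illustrative name only.

```python
def second_run_expansion(total_percent, first_percent):
    """Percent expansion needed on run 2 to reach a desired total.

    Solves (100 + total) / 100 = ((100 + first) / 100) * ((100 + second) / 100)
    for the second-run percentage.
    """
    if not 0 <= first_percent <= total_percent:
        raise ValueError("need 0 <= first_percent <= total_percent")
    return 100.0 * ((100.0 + total_percent) / (100.0 + first_percent) - 1.0)


# Desired total of 120% with 60% chosen for the first run: 2.2 / 1.6 = 1.375.
print(round(second_run_expansion(120, 60), 1))  # 37.5
```

The exact result is 37.5%, which the text reports as approximately 37%.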

The formulae just outlined provide the basis for operations
employed in the laboratories at Michigan State University.
Through the use of these equations, it is possible to derive
essentially any given level of time compression or expansion.
Furthermore, the formulae may be combined to obtain calibration
values for other procedures. For example, in a research
project concerned with the intelligibility of frequency shifted/
time restored speech in aging populations, the desired values
of slow playback speed and time compression were computed by
the formula illustrated in Figure 6. Since one aspect of this
project was to examine frequency shifted speech that had been
restored in time to original duration, the value for D, duration,
was set to 0%. In the example noted in the lower half of Figure
6, the spectral components of the speech stimuli have been
shifted downward by 40%. In order to restore such a signal to normal
duration, the equation was solved for percent of time
compression. As can be seen, approximately 20% compression was necessary
to bring the frequency shifted stimuli back to normal duration.


[Figure 6 graphic: algebraic equation and worked example relating
original duration, percent frequency shift, and percent time
compression; not legible in this copy.]

Figure  6.   Algebraic  equation  and  example  for  computing  the 
amount  of  time  compression  necessary  to  restore  a 
frequency  shifted  signal  to  original  duration. 




CONCLUSION 

In summary, it is hoped that the procedures and formulae
presented in this paper serve to emphasize the importance of
frequent calibration of portable time compressors/expanders.
Like all electronic equipment, these instruments are susceptible
to damage, to component age and malfunction, and to
inconsistencies due to normal use. Portable time compressors/expanders
may be even more vulnerable to such inconsistencies.
Consequently, in order to assure consistent and reliable performance,
it is important that portable processors receive frequent
calibration checks. Since the procedures outlined in this paper
are relatively simple and can be carried out in field
work, it appears they provide an adequate basis for such
monitoring.




REFERENCES 

Fairbanks, G., Everitt, W., and Jaeger, R., Methods for time
or frequency compression-expansion of speech. Inst. Radio
Engrs. Trans. Prof. Grp. Audio, AU-2, 7-12 (1954).




Musical  Pitch  Shifting:     A  Novel  Application  of  Technology 
Useful  in  Rate  Control 
by  Koch,    R.  F. 


ABSTRACT 

The essential difference between rate-controlled and speed-controlled
playback of audio is well known. An apparatus embodying this difference is
a pitch changer. The pitch-changing capability can be applied to music,
producing significant new effects. Specific problems relative to this application
are discussed, and a tape is played to demonstrate some of these effects.




I  want  to  talk  to  you  today  about  the  reverse  of  rate-controlled 
playback,  namely,  deliberate  pitch  change. 

To  begin  with,  allow  me  to  propose  a  definition:   Speed-controlled 
playback  of  audio  is  playback  in  which  the  speed  is  varied,  and  there 
is  no  further  modification  performed  on  the  signal.   Clearly,  this 
differs  significantly  from  rate-controlled  playback,  since  the  latter 
adds  the  important  modification  of  pitch  correction.   One  way  in  which 
such  pitch  modification  can  be  described  is  that  it  multiplies  all 
components  of  the  spectrum  of  the  signal  by  a  selected  constant. 
Nominally, the multiplying constant should be the inverse of the
constant by which the playback speed is multiplied relative to the
recording speed. As a practical matter, individual listener preferences
will often require that the pitch-modification constant be slightly
different from the nominal value, but the important consideration is
that the pitch-modification capability be flexible and have a range
equal to the inverse of the speed range. Thus, the pitch-modification
capability of a rate changer for speech should cover a range of
typically 2x increase to 2½x decrease, with even greater range desired
in some special cases.

In musical terms, a 2x increase in pitch signifies an up-shift
of one octave, and a 2½x decrease signifies a down-shift of almost
one-and-one-third octaves. This brings us to the central subject of
this paper, that is, consideration of how a pitch-shifter can be used
for musical purposes. Such wide shifts as one or more octaves are
probably excessive from a musical point of view.
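
The relation between a frequency ratio and a musical interval can be made explicit. The sketch below uses the standard definition that one octave is a doubling of frequency and, as a further assumption, equal temperament with twelve semitones per octave; the function names are illustrative.

```python
import math


def ratio_to_octaves(ratio):
    """Size of a pitch shift in octaves for a given frequency ratio."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return math.log2(ratio)


def ratio_to_semitones(ratio):
    """Same shift in equal-tempered semitones (12 per octave)."""
    return 12.0 * ratio_to_octaves(ratio)


print(ratio_to_octaves(2.0))            # 1.0
print(round(ratio_to_octaves(2.5), 2))  # 1.32
```

A factor of 2.5 therefore corresponds to about 1.32 octaves, consistent with the description of almost one-and-one-third octaves.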

On  the  other  hand,  narrower  ranges  of  pitch  change  have  been  invoked 
by  musicians  for  many  centuries;  the  use  of  electronic  technology 
for  this  purpose  is  only  a  new  chapter,  not  a  new  book. 

Pitch-shifting  in  music  has  at  least  two  functions.   One  of 
these  is  transposition  of  scales;  the  other  is  modification  of  the 
usual  sounds  of  instruments  or  human  voices.   Transposition  of  scales 
has been performed in a number of ways, and the history of doing this
goes far back into that of music. Undoubtedly, the earliest case of
transposition occurred when one singer wished to render a song
composed for another's voice. The second singer, having a different
register  from  the  first,  modified  the  music  in  such  a  way  that  the 
relative  frequencies  of  the  sung  tones  remained  the   same  as  in  the 
original,  but  the  absolute  values  of  these  frequencies  were  multiplied 
by some common constant. Right down until today, this remains a very
common activity, especially among amateurs wishing to join in a group
sing. Of course, the mathematical and physical characteristics of
this  modification  are  not  generally  recognized.   Certainly  they  are 
not  explicitly  recognized;  to  do  so  would  probably  only  make  a  naturally 
easy  operation  difficult. 

In  the  case  of  instrumental  music,  transposition  is  somewhat  more 
difficult than it is for the voice. Some of the simpler stringed
instruments can be retuned quite easily. At the other extreme, the
tuning  of  some  instruments,  such  as  cymbals,  is  fixed  at  the  time  of 
manufacture. 

Generally, transposition is fixed over the duration of a
composition. The purpose of the transposition is usually to modify a
condition that is static, over time periods long compared to the time of
performance. For example, it is recorded that on one occasion
"Beethoven, having to play his Concerto in C major, and finding the piano
half a tone too flat, transposed the whole into C# major" (Grove's
Dictionary  of  Music  and  Musicians,  v,  143).  This  was  a  remarkable 
feat,  and  one  which  few  persons  could  execute.  However,  we  should 
note  that  the  change,  once  elected,  was  maintained  throughout  the 
performance.  The  significance  of  this  in  the  present  context  is  that, 
although  what  Beethoven  accomplished  was  exceedingly  difficult,  his 
difficulty  was  not  compounded  by  having  to  vary  the  pattern  of  change 
with  which  he  began.  Because  there  are  so  few  Beethovens,  designers 
of  musical  instruments  have  from  time  to  time  developed  mechanisms 
to  assist  lesser  instrumentalists  in  producing  similar  effects. 

In these days of proliferating electronic devices, it is
reasonable to expect that transposition might be performed by this new art.
The expectation has been realized in the recording industry by a
special case of the technique of speed-controlled recording. For
example, suppose that it is desired to transpose the tone of a
percussion instrument. Retuning varies from difficult to impossible.
The solution is that the instrumentalist plays at a tempo different
from that desired for the version of the recording that is finally
to  be  issued.   This  one  rendition  is  recorded  alone,  as  is  frequently 
the  case  in  any  event,  when  popular  music  is  recorded  in  a  studio. 
When  this  rendition  is  combined  with  others,  to  form  an  ensemble 
recording,  the  basic  recording  is  played  faster  or  slower  than  its 
original  recording  speed.   This  speed  change  is  permissible  because 
the  tempo  was  adjusted  at  the  time  of  the  original  playing  to  take 
into  account  the  later  change.   Thus,  the  ultimate  tempo  is  the  one 
desired.   In  addition,  the  speed  change  upon  playback  produces  a 
change  in  pitch,  which  is  also  desired.   Evidently,  this  operation 
imposes  severe  technical  burdens  upon  at  least  the  instrumentalist 
and  the  recording  engineer.   It  is  probably  difficult  also  for  the 
composer  and  the  arranger,  who  are  forced  to  considerable  flights 
of  their  imaginations.  Nevertheless,  these  difficulties  can  all  be 
overcome. On the other hand, there is at least one difficulty
inherent in this process which is irremediable. That difficulty is
that the process cannot be performed without a lapse in time between
its initiation and its final accomplishment. In other words, it is
not a real-time process. Because of this limitation, it is
inapplicable to live performances.

In  contrast  to  this  cumbersome  technique  for  changing  pitch,  a 
pitch  compensator  that  is  at  the  heart  of  a  rate-controlled  playback 
system  performs  the  function  in  real  time  and  without  any  effort  on 
the  part  of  the  user.   Thus  the  pitch  compensator  appears  to  be  a 
most  attractive  device  for  recording  studios  and  for  live  performances 
of  music.  Please  note  that  I  said  "appears  to  be".  The  reason  for 
this  hedging  is  that,  by  the  Law  of  the  Natural  Recalcitrance  of 
Inanimate  Objects,  pitch  compensators  heretofore  available  have  had 
shortcomings  which  made  them  unsuitable  for  music. 

In  general,  pitch  compensators  use  the  technique  of  dividing  the 
signal  into  short  segments,  discarding  or  repeating  a  portion  of  each 
segment,  and  reassembling  the  modified  segments  in  their  original 
order.   An  exception  to  this  approach  is  the  Harmonic  Compressor, 
but  its  compensation  ratio  is  necessarily  fixed,  and  this  represents 
a  severe  impediment  to  practical  use  for  music.   The  segmenting 
pitch  compensators  tend  to  introduce  discontinuities  in  their  outputs. 
As  a  result  of  the  discarding  or  repeating  of  portions  of  segments  of 
the input signal, the output signal consists of a train of subsegments
wherein junctions of successive subsegments are subtly different from
the  smooth  flow  of  natural  sound.   In  speech  processing,  the  listener 
is  primarily  concerned  with  information  content,  and  has  a  rather 
high  tolerance  for  disturbances  such  as  distortion  and  noise.   This 
is  not  to  say  that  such  disturbances,  or  others,  can  be  ignored  by 
the  designers  of  speech  processors,  nor  have  they  been. 
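
The segmenting technique described above can be illustrated with a deliberately crude sketch. It shows only the discard case, with no smoothing at the junctions, and is not a description of any particular commercial device; the function name and parameters are hypothetical.

```python
def compress_by_discard(samples, segment_len, percent_discard):
    """Shorten a signal by dropping the tail of every fixed-length segment.

    Each segment keeps its first (100 - percent_discard)% of samples and
    the kept portions are rejoined in their original order. The abrupt
    joins are where discontinuities would arise.
    """
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    if not 0 <= percent_discard < 100:
        raise ValueError("percent_discard must be at least 0 and below 100")
    keep = round(segment_len * (100 - percent_discard) / 100)
    output = []
    for start in range(0, len(samples), segment_len):
        output.extend(samples[start:start + keep])
    return output


# 100 samples, segments of 10, discard 30% of each segment.
signal = list(range(100))
shortened = compress_by_discard(signal, 10, 30)
print(len(shortened))   # 70
print(shortened[5:9])   # [5, 6, 10, 11]
```

The jump from sample 6 to sample 10 in the output marks one junction between retained portions. Playing the shortened signal at a proportionally slower rate would restore the original duration with lowered pitch, which is the pitch-shifting use discussed in this paper.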

Finally,  we  have  come  to  the  central  issue  —  how  must  a  pitch 
shifter  for  music  differ  from  a  shifter  for  speech?  There  are  three 
broad  areas  that  must  be  considered.  These  are  signal-to-noise  ratio, 
dynamic  range,  and  frequency  range.   A  very  exact  delineation  of  the 
problem  would  set  forth  more  areas,  but  I  have  chosen  for  the  sake 
of  simplicity  to  join  a  number  of  considerations  under  the  single 
heading  of  signal-to-noise  ratio.  Broadly,  signal-to-noise  ratio 
describes  the  ratio  of  desired  signal  to  undesired  audible  material 
that accompanies it. Technically, signal-to-noise ratio has a
narrower meaning, with undesired elements such as distortion products
being considered separately from noise, but the umbrella approach is
a  convenient  one.  From  this  point  of  view,  then,  the  signal-to-noise 
ratio  for  music  must  be  significantly  better  than  for  speech;  since 
the  use  of  an  electronic  pitch  shifter  for  music  is  still  a  budding 
technique,  the  required  degree  of  improvement  is  yet  to  be  quantified 
with  exactness.  This  lack  of  quantification  applies  to  some  extent 
to the other parameters. However, it can be said in general terms
that dynamic range of the order of 30 dB minimum is required for
speech (and this, too, is indefinite), while the dynamic range for
music should be extended to at least 70 dB, and preferably more. In
similarly general terms, the frequency range for speech should be in
the order of 200 to 4,000 Hertz; for music, it should cover at
least 50 to 7,500 Hertz. I am sure that some persons will say
that these frequency limits are quite inadequate. The tape
that  you  are  about  to  hear  demonstrates  musical  pitch  shifting, 
using  a  pitch  shifter  that  is  still  under  development  and  will  be 
announced  commercially  this  winter.   I  bring  you  this  presentation 
now,  because  I  think  that  even  at  this  stage,  it  is  of  great 
interest  to  a  technically  sophisticated  audience.   Note,  in  this 
tape,  that  there  is  great  dynamism  in  pitch  shifting,  to  an  extent 
that  would  be  beyond  the  capabilities  of  even  a  Beethoven,  relying 
on  a  mechanical  instrument. 

This  tape  was  produced  through  the  use  of  the  Sound  Workshop 
Model  500  Real  Time  Pitch  Shifter.   I  acknowledge  with  thanks  the 
efforts  of  Mr.  Paul  Galburt  and  Mr.  Michael  Colchamiro  of  the 
Sound  Workshop  in  producing  and  furnishing  this  tape. 




Reaction Time in Identifying Time-Compressed Words
by  Hughes,  L.  H.  ,  &  Foulke,  E. 


Abstract 
An intelligibility experiment was performed to determine whether reaction
time in identifying isolated words increases with increasing time
compression. Each of 29 subjects listened to 500 words at one of five values
of compression. Reaction time did increase with increasing compression.
The results were interpreted as indicating that at least a partial
explanation of the decrease in comprehension of connected speech as
compression is increased is that listeners require more processing time.




REACTION  TIME  IN  IDENTIFYING  TIME-COMPRESSED  WORDS 
Lawson  H.  Hughes*  and  Emerson  Foulke** 

As  Foulke  and  Sticht  (1967a)  said  in  a  paper  read  at  the  first 
Louisville conference on compressed speech, "The perception of speech
entails  the  registration,  encoding  and  storage  of  speech  information, 
and  these  operations  require  time.  When  the  word  rate  is  too  high,  words 
cannot  be  processed  as  fast  as  they  are  received  with  the  result  that  some 
of  the  words  and  their  associated  meanings  are  lost  (p.  19)."  Of  course 
they  were  referring  to  the  perception  of  continuous  speech,  not  just 
isolated,  separate  words.   Another  common  way  of  implying  the  same  fact  is 
to  say  that  they  were  referring  to  research  concerned  with  comprehension 
rather  than  research  concerned  with  intelligibility. 

Now,  of  course  in  an  intelligibility  experiment  listeners  are  much  less 
restricted  in  the  time  they  have  available  for  registering,  encoding,  and 
storing speech information because the procedure involves the presentation
of  successive  words  at  intervals  of  time  of  the  order  of  several  seconds. 
In  such  an  experiment  it  seems  likely  that  after  one  or  two  seconds  the 
listener  can  identify  a  word  he  has  just  heard  as  accurately  as  he  could 
no  matter  how  much  additional  time  he  was  given.   However,  in  the  case  of 
connected  speech,  as  Foulke  and  Sticht  (1967a)  implied,  if  registration, 
encoding,  or  storage  of  a  word  or  a  string  of  words  is  still  in  progress 
when  other  words  demand  attention,  it  is  inevitable  that  some  words  will 
not  be  perceived. 

This  difference  between  experiments  concerned  with  intelligibility 
and  experiments  concerned  with  comprehension  is  illustrated  in  Figure  1. 




The  content  of  Figure  1  and  the  explication  of  that  content  are  intended 
to  be  neutral  with  respect  to  theories  of  language.   If  some  of  the  words 
in  a  presentation  are  not  perceived  correctly,  this  is  a  relevant  fact  for 
any  theory  of  language.   In  the  upper  part  of  Figure  1,  Part  A,  the  time 
relations  are  shown  for  an  intelligibility  experiment.   As  shown  in  the 
figure,  the  time  it  takes  a  listener  to  identify  a  first  word  extends 
somewhat  beyond  the  time  it  takes  to  present  the  word  to  him.   However,  he 
has  ample  time  to  prepare  for  the  onset  of  a  second  word.   In  the  lower 
part  of  Figure  1,  Part  B,  the  time  relations  are  shown  for  a  comprehension 
experiment.   As  in  Part  A,  the  time  it  takes  a  listener  to  identify  a 
first  word  extends  beyond  the  time  it  takes  to  present  the  word  to  him,  but 
unlike  Part  A,  this  time  period  overlaps  with  the  time  during  which  a  second 
word  is  being  presented.   That  is  a  way  of  saying  that  the  listener  is 
still  identifying  the  first  word  when  the  second  word  has  its  onset. 
Therefore,  even  if  it  should  be  the  case  that  the  time  compression  of  words 
does  not  lower  the  speed  of  identifying  each  of  them  separately,  the  listener's 
task  becomes  more  difficult  as  compression  is  increased,  for  the  words  occur 
in increasingly rapid succession and hence the time available for
identification is reduced. We have just implied that a favorable
case for comprehension would be one where the time compression of words
does not lower the speed of identifying them; the best
possible case would be one where the time compression of words increases
the speed of identifying them. With such an increase, the listener should
be more nearly able to maintain the level of identification that he
achieved when the words were not time compressed.

Figure 1. A simplified diagram of time relations between presentation
and identification of words in intelligibility experiments and in connected
speech experiments.

PART A. Intelligibility Experiment

Presentation of first word; reaction time to first word (extending
beyond the presentation); rest period; presentation of second word;
reaction time to second word.

PART B. Connected Speech Experiment

Presentation of first word, followed by presentation of second word;
reaction time to first word overlaps the presentation of the second
word. In the time interval t the listener is doing two things:
completing his identification of the first word and beginning his
identification of the second word.

One might be inclined to say that it is unthinkable that listeners
could identify compressed words more quickly than uncompressed words.
However, consider the possibility that what we might call an imitative
response occurred, i.e., the listener might respond more quickly when the
word is said more quickly for him. In fact some evidence of such an effect
was found in a study by Shriner and Sprague (1969) where children
identified which one of three line drawings, all shown at the same time,
corresponded to a word they had just heard. Normal speech was time compressed
by three amounts, and mean reaction time was significantly shortest at the
intermediate one of these three values. Further, it was only as amount of
compression increased beyond this middle value that a substantial increase
in mean number of erroneous identifications of the words occurred. This
latter fact suggests that a necessary condition for the imitative response
is that the words be rather easily identifiable.

The  study  by  Shriner  and  Sprague    (1969),    in   fact,    is   apparently   the 
only  prior   intelligibility  study   in  which    reaction   time  was    recorded  under 
several    values  of  time  compression  of  normal    speech.      Shriner,   Beasley, 
and  Zemlin    (1969)    studied  children's    reaction   time   to  frequency-divided 
speech,   and  the   frequency-divided  speech  was   also  time   compressed   by   two 
amounts.      Mean    reaction   time    increased  with    increasing  time  compression, 
but   the  only  significant  differences   found  were  those  between  normal 
speech  and  each  of  the   frequency-divided  conditions.      Of  course   these 
increases  may  have  been   unique   to  frequency-divided  speech.      Hecker, 




Stevens,  and  Williams  (1966)  found  that  adults'  errors  and  reaction 
time  in  an  intelligibility  experiment  generally  increased  as  the  signal- 
to-noise  ratio  decreased,  although  no  test  of  statistical  significance 
was  made. 

However, the concern in the present study was with the effect specifically
of time compression of normal speech on the accuracy and speed
of recognition of separate, isolated words. Previous research has provided
ample evidence as to the general effect of compression on accuracy of
recognition--i.e., intelligibility (e.g., Garvey, 1953). Intelligibility
decreases slowly at first as compression increases, and only at a rather
high level of compression does it begin to decrease rapidly. On the other
hand, as indicated earlier, rather little is known about the effect of
compression on reaction time, and the primary purpose of the present study was
to  add  to  information  about  this  function.   It  will  be  recalled  that  the 
significance  of  information  about  this  function  is  that  it  may  throw  some 
light  on  the  question  of  why  comprehension  of  connected  speech  decreases 
as  compression  increases. 

Method 

Subjects. Twenty-nine students in introductory psychology classes
at the University of Louisville participated in the experiment as a way of
satisfying a course requirement. None of the students had participated
in any other experiments in compressed speech.

Recorded material. Ten sets of 50 monosyllabic words were used. These
sets were the first 10 of the 20 sets of words prepared by Egan (1948).
Each set is said to be phonetically balanced and also to be very nearly as
difficult as each of the other sets in terms of articulation test
(intelligibility) scores. The words in each of the 10 sets separately were
ordered  randomly.   Then,  the  10  sets  were  recorded  on  one  track  of  an 
audio  tape  in  the  set  1  to  set  10  order  by  a  professional  reader.   The 
onset of successive words occurred at approximately 5-sec intervals,
including  the  first  word  of  each  set  after  the  first--i.e.,  there  was 
no  additional  pause  time  between  successive  sets.   Subsequently,  an  370 
Hz  tone  of  approximately  1-sec  duration  was  recorded  on  the  second  track 
of  the  same  tape  prior  to  each  word  to  serve  as  a  ready  signal.   It 
terminated  each  time  approximately  1  sec  prior  to  the  onset  of  the  word 
it  preceded. 

The  500  words  were  time  compressed  to  one  of  five  values  on  the  Graham 
Whirling Dervish in the Perceptual Alternatives Laboratory at the University
of Louisville. These values, expressed in percentage of original
signal remaining, were 100, 41, 32, 29, and 25. The discard interval was
20 milliseconds (ms). The interval between the onset of successive words
that  existed  in  the  original  recording  (viz.,  5  sec)  as  well  as  the  location 
and  duration  of  the  ready  signal  were  maintained  in  each  of  the  compressed 
recordings. 
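
The relation between the discard interval and the percentage of signal remaining can be illustrated with a minimal sketch. It assumes the usual sampling method, in which retained and discarded segments alternate and the retained segments are abutted, so that the fraction remaining equals keep / (keep + discard). The function names and the sample-based `compress` helper are hypothetical illustrations, not a description of the Graham device itself.

```python
def retained_interval_ms(percent_remaining, discard_ms=20.0):
    """Retained-segment length implied by a discard interval and a percentage remaining.

    Assumes fraction remaining = keep / (keep + discard), which gives
    keep = discard * fraction / (1 - fraction).
    """
    fraction = percent_remaining / 100.0
    if not 0.0 < fraction < 1.0:
        raise ValueError("percent_remaining must be strictly between 0 and 100")
    return discard_ms * fraction / (1.0 - fraction)


def compress(samples, keep, discard):
    """Alternately keep `keep` samples and drop `discard` samples, abutting the kept runs."""
    out = []
    position = 0
    while position < len(samples):
        out.extend(samples[position:position + keep])
        position += keep + discard
    return out


# Compressed values reported in the text (100 per cent means no compression).
for percent in (41, 32, 29, 25):
    print(percent, round(retained_interval_ms(percent), 1), "ms retained per 20 ms discarded")
```

Under this assumption, keeping one sample in every four (keep=1, discard=3) leaves 25 per cent of the original signal, the most severe value used here.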

Apparatus.  The  subject  sat  in  a  Tracor  Audiometric  and  Medical  Research 
Room,  which  appeared  to  be  soundproof  under  the  conditions  of  this  experi- 
ment.  The  room  was  6  ft  square  and  7  ft  high.   In  attempting  to  identify 
words  he  heard,  the  subject  spoke  into  a  Neumann  Model  KM84I  condenser 
microphone.   He  wore  Sharpe  Model  HA  10A  stereo  headphones.   He  could  adjust 
the volume level in his headphones by turning a knob on an attenuator located
on a table in front of him. The words and ready signals (tones) were




played  from  a   Revox  tape   recorder.      Its  outputs  went   through   a  bridging 
network to an audio amplifier and thence to the subject's attenuator.
The output of recorded words from the Revox went also to a Denon tape
recorder and was recorded on one track of its tape. The output of recorded
tones from the Revox went also to a Shure Model M67 Professional
Microphone  Mixer,  with  a  second   input   to  the  mixer  being   from  the  subject's 
microphone.      The  output  of  the  mixer  went   to  the  Denon,  where   the  subject's 
responses   and  the   tones  were   thus    recorded  on   the  second  track  of   its   tape. 
With   the  words    recorded  on  one   track  of  the   Denon   and   the   subject's    responses 
and  the   tones  on  the  other  track,    it  was   possible  to  measure   the  subject's 
reaction   times   at  a   later  time.      The   tones  were    recorded  on   the   same   track 
as   the  subject's    responses  were    in  order  that   the  tone  would  stop  a   reaction 
time  clock    in   case   the  subject  had   failed  to   respond.      A  pair  of  BRS  Model 
202   Schmitt  Triggers  were    incorporated   in   a  system  of  BRS    logic  modules  that 
enabled  the  measurement  of   the   time   from  the  onset  of  a  word  to  the  onset  of 
the  subject's    response.      This   time    (reaction   time)   was  measured  and  displayed 
to  the  nearest  ms  by   a  Hunter  Model    1520   Digital   Timer.      The  audio  output  of 
the   Denon  enabled   the  experimenter  to  determine   the  accuracy  of  the  subject's 
spoken    response   concurrently  with   the  measurement  of   reaction   time. 
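
The timing logic described above can be sketched in software. This is only an illustrative analogue of the trigger-and-clock arrangement, assuming each tape track has been reduced to a list of signal levels sampled once per millisecond; the function names, the single fixed threshold, and the example data are hypothetical. A plain threshold crossing stands in for the Schmitt triggers, and, as in the apparatus, the next tone on the response track would stop the clock if the subject never responded.

```python
def first_onset_ms(levels, threshold, sample_period_ms=1.0):
    """Time of the first sample whose level reaches the threshold, or None if none does."""
    for index, level in enumerate(levels):
        if level >= threshold:
            return index * sample_period_ms
    return None


def reaction_time_ms(word_track, response_track, threshold, sample_period_ms=1.0):
    """Time from word onset to the first event on the response track after that onset."""
    word_onset = first_onset_ms(word_track, threshold, sample_period_ms)
    if word_onset is None:
        return None
    start = int(word_onset / sample_period_ms)
    # The response track carries both the spoken response and the tones,
    # so whichever comes first after the word onset stops the clock.
    return first_onset_ms(response_track[start:], threshold, sample_period_ms)


# Hypothetical trial: word begins at 100 ms, spoken response begins at 450 ms.
word = [0.0] * 100 + [1.0] * 300 + [0.0] * 600
response = [0.0] * 450 + [1.0] * 200 + [0.0] * 350
print(reaction_time_ms(word, response, threshold=0.5))
```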

Procedure.      Subjects   participated  one  at  a  time.      They  were  assigned 
randomly   to  one  of  the   five  compression  values,  with   the    restrictions   that 
all    five  compressions  were  used  once  each  before  any  compression  was   used 
a  second  time,   that  all  were  used  twice  before  any  was   used  a   third   time, 
etc.      When   the  subject  arrived,   he  was  seated   in   the  Tracor  Room   in   front  of 
the  microphone  and  attenuator.      Then,   the  experimenter   read  the   following 




instructions: 

I'd   like   for  you  to   listen   to  a  number  of  words   that  we  have   tape 
recorded.     The  words  will   be  presented  one  at   a   time,   and  your  task 
is  simply   to  say  out    loud  what  you  heard  each   time.      That    is,    I'll 
play  a  word   for  you  and  then  you  say  what  you   think  the  word  was. 
So  that  you  will   know  when   a  new  word   is   about    to  occur,   you  will 
hear  a  brief  tone  just   before  each  word.      Now,    the  most    important 
thing   is   that  you  say  the  word  as  quickly  as  you  can  after  you  hear 
it.      You  may  think  that    if  you  answer  very  quickly,   you  will   say  the 
wrong  word.      But   that    isn't   true.      We  have   found   that   people  who   really 
pounce on the word are almost as accurate as those who answer more leisurely.
So don't worry about whether you might make an occasional
mistake.      In  most  cases  you'll    find   that   the  word  just   sort  of  comes 
out  automatically.      To  give  you  an    idea  of  how  quickly  you  should 
respond,    I'd   like   for  you  to  give  me  a  simple  word   like   the  name  of  an 
object,   and    I'll    respond  as  quickly  as    I    can.      (The  experimenter 
shadowed the subject's response--i.e., he attempted to start repeating
the  subject's    response  before    it  was   complete,   aided  by   lip   reading. 
He   repeated  this   procedure   for  the   two  succeeding  words.)      Now  give  me 
another  word.    ...      OK,   give  me  one  more  word.    ...      Now   I'll   give  you 
some words and see how quickly you can respond. Shop. (The experimenter
waited for the subject's response and either urged him to respond
more quickly or told him that he was doing well. He repeated this
procedure for the four succeeding words bell, sack, park, and teach.)
OK,   that's   the  way    I'd   like   for  you  to  respond.      Really  pounce  on   the 




word when you hear it. We have an electronic clock that we'll use
to  measure  how   fast  you   respond  to  the  nearest  hundredth  of  a  second. 
By  the  way,   try  not  to  make  noises    like  tapping  on   the  table  because 
the  microphone  will    respond   to  that   sort  of  thing  just    like    it  will 
to  your  voice.      Do  you  have  any  questions  about  what  you  should  do? 
(Questions   about   the  procedure  were  answered  ad   lib.      Subjects  were 
asked  to  defer  other  questions   until    later.)      I'll   go  outside  now 
and we'll get started. (However, before the experimenter left he instructed
the subject in the use of the attenuator for a comfortable
listening   level   and  assisted  him  in  adjusting  his  headphones.) 
The   door  to  the  subject's   room  was   then   closed,   and  the  500  words,  each 
preceded by a tone, were presented binaurally through the subject's
headphones.      The  subject's   spoken    responses  were   recorded   in   such  a  way 
that   both   their  accuracy  and   reaction   time  could  be  determined  at   a   later 
time,   as   described  earlier   in   the  Apparatus   section. 
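
The restricted random assignment described at the start of the Procedure (every compression value used once before any is used a second time, and so on) amounts to randomizing within successive blocks of five subjects. A minimal sketch follows; the function name and the seed parameter are hypothetical, and the actual randomization device used in the study is not stated.

```python
import random
from collections import Counter

COMPRESSION_VALUES = (100, 41, 32, 29, 25)  # percentage of original signal remaining


def assign_conditions(n_subjects, conditions=COMPRESSION_VALUES, seed=None):
    """Assign subjects in arrival order, shuffling the conditions within each block."""
    rng = random.Random(seed)
    assignments = []
    while len(assignments) < n_subjects:
        block = list(conditions)
        rng.shuffle(block)  # each condition appears exactly once per block
        assignments.extend(block)
    return assignments[:n_subjects]


schedule = assign_conditions(29, seed=0)
print(Counter(schedule))
```

With 29 subjects, this scheme necessarily yields four groups of six and one group of five, which is consistent with the unequal group sizes mentioned in the Results.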


Results 
Errors.      The  mean  number  of  errors  made  over  all    500  words  was  determined 
for  each   compression   value  separately.      These  means    for  groups  with  percentages 
of original signal remaining of 100, 41, 32, 29, and 25 were, respectively,
5.5, 6.2, 9.4, 10.4, and 17.9. Thus, as amount of compression increased,
mean number of errors increased consistently. This result is presented
graphically as the solid line in Figure 2. In keeping with prior studies,
errors increased slowly at first with increasing amount of compression, but



Figure  2.   Reaction  time  and  number  of  errors  as  a  function  of  percentage 
of  the  original  time  in  which  each  word  was  read  and  recorded. 


[Figure 2 graph not reproduced. Horizontal axis: percentage of original
recording time.]


when less than 41 per cent of the original signal remained, errors increased
more rapidly.

It  seemed  of  interest  also  to  determine  whether  mean  number  of 
errors  changed  over  the  10  sets  of  phonetically-balanced  words.   A 
question  that  is  continually  raised  is  the  extent  to  which  people  can 
learn  to  listen  more  effectively  to  compressed  speech  simply  by  practice 
in  listening,  although  ordinarily  the  question  is  asked  about  connected 
speech  rather  than  isolated  words  (Foulke  and  Sticht,  1967a).  The  mean 
number of errors made on each set of 50 words over all values of compression,
beginning with the first set and ending with the tenth set, were
13.4, 11.3, 10.4, 9.4, 9.1, 7.3, 7.6, 7.3, 7.8, and 6.8. Thus, as number
of  words  listened  to  increased,  errors  generally  decreased.  This  result 
is  presented  graphically  as  the  solid  line  in  Figure  3.   By  the  tenth  set 


of  50  words,  the  mean  number  of  errors  over  all  values  of  compression 
(viz., 6.8) was not substantially greater than the mean number of errors
over  the  entire  500  words  made  by  the  group  for  whom  100  per  cent  of  the 
original  signal  was  used  (viz.,  5.5). 

A two-way analysis of variance was performed on the error data to determine
the reliability of the differences between means over compressions
and  over  blocks  of  fifty  words  as  well  as  to  determine  whether  there  was 
an  interaction  between  amount  of  compression  and  the  ordinal  number  of 
blocks  of  fifty  words.   The  analysis  involved  independent  measures  with 




Figure  3.   Reaction  time  and  number  of  errors  as  a  function  of  amount  of 
practice  for  all  compression  groups  combined. 


[Figure 3 graph not reproduced. Axes: mean number of errors; mean of median
reaction times (milliseconds).]


respect to compression and repeated measures over the ten blocks of fifty
words. The results of the analysis, a weighted-means one (Winer, 1971)
in view of unequal group sizes, are shown in Table 1. The critical region
adopted for statistical significance, both for this analysis and for a
succeeding one, corresponds to the .05 level. As can be seen in Table 1,
the main effect of compression and the main effect of ordinal block were
both statistically significant, while the interaction between these two
variables was not significant.

Table 1
Analysis of Variance for Errors

Source                           df          MS          F

Between subjects
  Compression (C)                 4     1391.91     11.81*
  Subjects within groups         24      117.89
Within subjects
  Ordinal block (O)               9      138.19     15.51*
  C x O                          36        2.92      < 1
  O x subjects within groups    216        8.91

*p < .05

Reaction time. Since the distribution of reaction times was positively
skewed, the basic datum used was the median reaction time over a set of 50
words. Both incorrect and correct responses were included when it was
determined that the exclusion of reaction times to words incorrectly identified
produced virtually the same results. The mean value of the medians over
all 500 words was determined for each compression value separately. These
means for groups with percentage of original signal remaining of 100, 41,
32, 29, and 25 were, respectively, 301, 343, 367, 356, and 414 ms. Thus,
as amount of compression increased, the mean of the median reaction times
generally increased. This result is presented graphically as the broken
line in Figure 2. It is more nearly like the results of Hecker, Stevens,
and Williams (1966) than like those of Shriner and Sprague (1969) in that
except for one small inversion, reaction time increased with greater
compression. The inversion that did occur did not, as in Shriner and Sprague's
(1969) study, occur in the region where errors were still minimal, and
in fact in that region reaction time went up sharply in the present study.
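
The use of the median as the basic datum can be illustrated with a short sketch. It assumes each subject's reaction times are available as a list in presentation order; the function names and the example values are hypothetical and are not data from the study.

```python
from statistics import mean, median


def block_medians(reaction_times_ms, block_size=50):
    """Median reaction time for each consecutive block of `block_size` trials."""
    return [
        median(reaction_times_ms[start:start + block_size])
        for start in range(0, len(reaction_times_ms), block_size)
    ]


def mean_of_medians(per_subject_times, block_size=50):
    """Average every subject's block medians into one value for a group."""
    all_medians = []
    for times in per_subject_times:
        all_medians.extend(block_medians(times, block_size))
    return mean(all_medians)


# Hypothetical trials (ms): one slow response pulls the mean up but not the median.
example = [300, 310, 320, 330, 1500]
print(mean(example), median(example))
```

With a positively skewed distribution, a few very slow responses inflate the mean of a block, while the block median stays near the typical response.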

Parallel    to  the  question  of  the   functional    relation   between  errors 
and  the   prior  number  of  sets  of  words    listened   to   is   the    relation   between 
reaction   time   and  the  prior  number  of  sets.      A  decrease    in    reaction   time  with 
practice  might  be   taken,   along  with   a  decrease    in  errors,    to   indicate   a 
learning  effect.      The  mean  of  the  median    reaction   times   for  each  block  of 
50  words  over  all    values  of  compression,   beginning  with   the   first  block 
and  ending  with   the  tenth  block,  were   390,   385,   371,   343,   350,   348,   350, 
338,    340,   and  337  ms.      Thus,   as  number  of  words    listened   to   increased, 
reaction   time  generally   decreased.      This    result    is   presented  graphically 
as   the  broken    line    in   Figure   3. 

A two-way analysis of variance was performed on the reaction time data
similar to that for the error data. The results of the analysis are shown
in Table 2. As can be seen in this table, the main effect of compression
was not significant, the main effect of ordinal block was significant, and
the interaction between these two variables was not significant. Inspection
of the reaction time function in Figure 2 revealed that the values of mean
reaction time for three of the compression values were very close together.
These values were 343, 367, and 356 ms, respectively, for percentages of
original recording time of 41, 32, and 29. It seemed likely that this fact
accounted for the nonsignificance of a relation that otherwise appeared to
be significant. Therefore, an analysis was done to determine the
significance of the difference between the lowest and highest data points,
viz., those for 25 per cent and 100 per cent of original recording time.
The studentized range statistic (Winer, 1971) was used, which showed that
reaction time increased significantly between these two points (q = 9.43).

Table 2
Analysis of Variance for Reaction Time

Source                           df          MS          F

Between subjects
  Compression (C)                 4   92,167.70       1.30
  Subjects within groups         24   70,751.99
Within subjects
  Ordinal block (O)               9   10,979.73       9.14*
  C x O                          36      305.44      < 1
  O x subjects within groups    216     1201.09

*p < .05
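
As a check on the two analyses of variance, the degrees of freedom and the F ratios can be recomputed from the design and the reported mean squares. The sketch below assumes the standard layout for one between-subjects factor and one repeated factor, with 29 subjects, 5 compression groups, and 10 blocks; the function names are hypothetical.

```python
def anova_degrees_of_freedom(n_subjects, n_groups, n_blocks):
    """Degrees of freedom for one between-subjects factor and one repeated factor."""
    between = n_groups - 1
    subjects_within = n_subjects - n_groups
    blocks = n_blocks - 1
    return {
        "C": between,
        "subjects within groups": subjects_within,
        "O": blocks,
        "C x O": between * blocks,
        "O x subjects within groups": subjects_within * blocks,
    }


def f_ratio(ms_effect, ms_error):
    """F is the effect mean square divided by its error mean square."""
    return ms_effect / ms_error


df = anova_degrees_of_freedom(n_subjects=29, n_groups=5, n_blocks=10)
print(df)

# Mean squares as reported in the two tables.
print(round(f_ratio(1391.91, 117.89), 2))     # errors: compression
print(round(f_ratio(138.19, 8.91), 2))        # errors: ordinal block
print(round(f_ratio(92167.70, 70751.99), 2))  # reaction time: compression
print(round(f_ratio(10979.73, 1201.09), 2))   # reaction time: ordinal block
```

The between-subjects effect is tested against subjects within groups, and the within-subjects effects against the block by subjects-within-groups term.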

Discussion 
The results with respect to the mean number of errors made as a
function of amount of time compression supported the results of prior
studies (e.g., Garvey, 1953; Foulke and Sticht, 1967b). This result,
together with the apparent learning effect over blocks of 50 words, is, of
course, not of primary interest in the present study. Rather, interest
lies primarily in the fact that reaction time increased as amount of
compression increased, for this result supports the position taken by Foulke
and Sticht (1967a) that the decrement in comprehension of connected speech
as amount of compression increases is at least in part a function of the
increasingly limited time available to the listener to process speech
information. It would seem to follow that longer reaction times to separate
words would, in the case of connected speech, result in inadequate processing
of some words because the listener would be faced with the task
of having to process a number of words within a short time period, as
described at the beginning of this paper. Of course, some caution must be used
in drawing the conclusion that reaction time increases with increasing
compression in view of the fact that the F comparing all five groups was not
significant. However, this outcome appears to have been a function of the
values of compression that were chosen, as discussed in the Results section.



As for the relationship between the present reaction time results
and the results of previous research, the most meaningful comparison can
perhaps be made with the study by Shriner and Sprague (1969). It will
be recalled, however, that they found reaction times shortest at an
intermediate value of compression. As they pointed out, there was very little
increase in number of errors as compression was increased to that
intermediate value. This result is unlike that of the present study where, as
shown in Figure 2, there was a large increase in reaction time between
100 per cent and 41 per cent of original recording time, a region within
which there was very little increase in number of errors. This large
increase in reaction time accompanying a small increase in errors is somewhat
reminiscent of prior research comparing comprehension and intelligibility
functions (Foulke and Sticht, 1967b), in which it was found that with
increasing compression, comprehension begins to decline rapidly sooner than
intelligibility does. The reaction time function in Figure 2 could well
pass for a comprehension function, giving added support to the position
that comprehension is influenced importantly by the time pressure described
earlier.

There  were  many  differences  between  the  Shriner  and  Sprague  (1969)  study 
and  the  present  one  with  respect  to  materials,  procedures,  and  subjects, 
and  therefore  it  is  impossible  at  present  to  determine  why  the  reaction  time 
results  are  somewhat  discrepant.   Of  course  the  results  of  the  two  studies 
are  similar  in  important  respects.   There  seems  little  doubt  that  each  study 
leads  to  the  conclusion  that  with  sufficient  time  compression,  reaction  time 
in  identifying  isolated  words  is  lengthened  substantially. 




It is of some interest to draw a parallel between the point of
view we have adopted and a summary statement related to an analogous
line of research in short-term memory. According to Aaronson (1967),
"... experimental evidence indicates that perceptual processes continue
to occur after the physical stimulus presentation--either auditory or
visual--is terminated. Interference with or termination of these
post-presentation perceptual processes can lead to decreased recall accuracy
in short-term memory tasks (p. 136)."

Finally, the present results would appear to be closely related to
the results of a study by Overmann (1971) where pieces of leader tape were
spliced into compressed tapes between phrases and sentences of connected
speech, thus restoring "unfilled time" that had been reduced by the process
of compression. The result of such time restoration was to increase
comprehension significantly at two of three levels of time compression. These
increases were not large, nor did time restoration keep comprehension from
declining just as rapidly when compression was increased as it did without
time restoration. However, the distribution of restored time was made in
a very simple way, and as the author suggests, other ways of distributing
the time might result in the maintenance of comprehension at a higher level.
The results of Overmann's (1971) study together with those of the Shriner
and Sprague (1969) and the present study certainly suggest that attempts to
establish adequate rules for such restoration of time might well be worth
the effort both for theoretical and for practical reasons. From a
theoretical point of view, the role of processing time could be investigated further,
and from a practical point of view, means of substantially increasing the
comprehension of compressed speech might be discovered.



REFERENCES 

Aaronson, D. Temporal factors in perception and short-term memory.
Psychological Bulletin, 1967, 67, 130-144.

Egan, J. P. Articulation testing methods. Laryngoscope, 1948, 58, 955-991.

Foulke, E., and Sticht, T. A review of research on time-compressed speech.
In Foulke, E. (Ed.), Proceedings of the Louisville conference on
time-compressed speech. Louisville: University of Louisville, 1967. (a)

Foulke, E., and Sticht, T. The intelligibility and comprehension of
time-compressed speech. In Foulke, E. (Ed.), Proceedings of the Louisville
conference on time-compressed speech. Louisville: University of Louisville,
1967. (b)

Garvey, W. D. The intelligibility of speeded speech. Journal of Experimental
Psychology, 1953, 45, 102-108.

Hecker, M. H. L., Stevens, K. N., and Williams, C. E. Measurements of reaction
time in intelligibility tests. Journal of the Acoustical Society of America,
1966, 39, 1188-1189.

Overmann, R. A. Processing time as a variable in the comprehension of
time-compressed speech. In Foulke, E. (Ed.), Proceedings of the second
Louisville conference on rate and/or frequency controlled speech.
Louisville: University of Louisville, 1971.

Shriner, T. H., Beasley, D. S., and Zemlin, W. R. The effects of frequency
division on speech identification in children. Journal of Speech and
Hearing Research, 1969, 12, 413-422.

Shriner, T. H., and Sprague, R. L. Effects of time-compressed speech signals
on children's identification accuracy and latency measures. Journal
of Experimental Child Psychology, 1969, 7, 532-540.

Winer, B. J. Statistical principles in experimental design (2nd ed.).
New York: McGraw-Hill, 1971.


Footnotes 
*Dr.  Hughes  is  Associate  Professor  of  Instructional  Systems  Technology 
and  Research  Associate  in  the  Audio-Visual  Center,  Indiana  University, 
Bloomington, Indiana 47401.

**Dr.  Foulke  is  Professor  of  Psychology  and  Director  of  the  Perceptual 
Alternatives  Laboratory,  University  of  Louisville,  Louisville,  Kentucky 
40208. 




The  Wichita  Studies  on  Comprehension  of  Rate- Controlled  Speech 
by McCroskey, R. L., & Nelson, N. W.




Abstract 


THE  WICHITA  STUDIES  ON  COMPREHENSION  OF 
RATE- CONTROLLED  SPEECH 

Robert  L.    McCroskey,    Ph.D.       Nickola  W.    Nelson,   Ph.D. 

The purpose was to delineate some effects of temporal manipulation
of spoken messages upon auditory comprehension by various groups of
children. Research quality equipment and appropriate controls were used
in all phases. Subject response methods included multiple-choice picture
selection or spoken imitation. The results indicated that:

When  language  impaired  children  (N  =  20;  age  5  through  17  years) 
listened  to  simple-active-affirmative-declarative  sentences  (SAAD), 
rate  of  speaking  was  a  significant  factor  for  the  younger  subjects. 

When  mentally  retarded  children  (N  =  23;  age  7  through  10  years) 
listened to SAAD sentences, comprehension was better under conditions
of expansion than under compression.

When normal first graders (N = 20; age 6 through 7 years) responded
to SAAD sentences, there was no difference in comprehension
under conditions of compression or expansion; those
children  within  a  normal  classroom  who  had  been  referred  for 
special  reading  did  show  significant  differences  in  comprehension 
according  to  rate  of  speaking. 

When  normal  children  (N  =  360;  ages  6  through  9  years)  listened 
to  sentences  where  both  difficulty  and  rate  of  speaking  were  varied, 
comprehension  was  influenced  by  rate,   difficulty  and  age. 

When  reading  disordered  children  (N  =  60;  age  7  through  9  years) 
listened  to  sentences  of  varying  difficulty  at  different  rates  of 
speaking,    their  comprehension  was  poorer  at  all  levels  of  sentence 
difficulty  and  was  more  influenced  by  variation  in  speaking  rate 
than  was  true  for  the  control  subjects. 

When  Black  English  speaking  children  (N  =  80;  age  6  through  9 
years)  listened  to  sentences  which  varied  in  difficulty  and  in  rate, 
comprehension skills were more sensitive to rate-controlled input
than those of their standard English counterparts.

When  normal  first  graders  (N  =  20;  ages  6  and  7  years)  listened 
to  rate-controlled  nonsense  syllables,    imitative  skills  were 
markedly  improved  under  expanded  conditions. 

Discussion  centers  around  commonalities  in  auditory  comprehension 
under  conditions  of  temporal  manipulation  by  several  groups  of  school  age 
children.     The  general  conclusion  is  that  comprehension  is  facilitated  when 
rate  of  speaking  is  expanded  and  that  there  are  strong  implications  for 
preparation  of  educational  materials. 



THE  WICHITA  STUDIES  ON  COMPREHENSION  OF  RATE  CONTROLLED  SPEECH 
Robert  L.  McCroskey  and  Nickola  W.  Nelson* 

This  is  a  report  on  a  series  of  investigations  conducted  at 
Wichita  State  University  for  the  purpose  of  studying  some  of  the 
effects  of  rate-altered  speech  upon  auditory  comprehension  by  normal 
children  and  by  children  having  various  learning  problems. 

In  general,  it  is  hypothesized  that  many  children  exhibiting 
learning  problems  are  children  whose  auditory  systems  do  not  process 
language  with  sufficient  rapidity.  On  the  basis  of  previous  work 
(Efron, 1963; Lowe & Campbell, 1965; and Eisenson, 1968) it appears
that  both  adult  and  childhood  aphasia  are  associated  with  reduced 
ability  to  perform  temporal  tasks  involving  succession  and  order. 
These  data  combine  with  the  work  of  Stroud  (1967),  indicating  that 
even children with alleged functional misarticulations do not comprehend
rapid speech as well as normal-articulating children, and
with the common observations that adult speech seems to be accommodated
to the adult's intuitive feeling about the rate at which
children  at  different  age  levels  can  comprehend  speech  (Broen,  1972; 
Fraser,  1975);  thus,  simplified  sentence  structure  and  much  slower 


*Dr.  Robert  L.  McCroskey  is  Professor  and  Dr.  Nickola  W.  Nelson  is 
Assistant  Professor  in  the  Department  of  Logopedics,  Wichita  State 
University,  Wichita,  Kansas  67208. 


61 


speech  is  used  with  nursery  school  children  than  would  be  true  for 
elementary  school  children  and  higher  age  levels.  It  seems  clear 
that  rate  of  speaking  has  an  influence  on  auditory  comprehension 
by  individuals  of  different  ages  and  with  different  problems,  the 
only  question  is  one  of  specifying  the  effects  of  rate  of  speaking 
within  these  various  categories. 

Brief descriptions of seven investigations utilizing rate-altered
speech are included. The specific subject categories include:
(a) language impaired children, (b) mentally retarded children, (c)
normal first-graders, (d) normative data for the primary grades,
(e) reading disordered children, and (f) Black English speaking
children. Each of the seven studies was conducted independently,
but the basic procedure was common throughout.

The  method  of  rate  alteration  was  the  same  for  all  experiments. 
The  original  recordings  for  all  conditions  were  made  either  on  a 
Magnacorder  Model  1022  or  a  Nagra  IV-D,  with  the  restriction  that 
only one of these recorders was used in any given experiment. All
recordings were made by a female speaker in a double-wall IAC booth
using a Shure Model 560 dynamic microphone. Where conversion to a tape
speed of 15 ips was required, the second copy was made on an Ampex 350
recorder. All rate variations were accomplished by using a Rate
Changer, manufactured by the Eltro Company of Heidelberg, Germany.
This device controls rate of speaking through periodic sampling of
the speech signal and subsequent abutting or reproducing of these
extremely brief samples, thus accomplishing alteration of word rate
with negligible pitch distortion. Spectrographic analysis of
rate-altered signals indicated a sampling interval of approximately .03
seconds.
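
The sampling method just described can be illustrated with a minimal
digital sketch. It is only an analogy to the Rate Changer, not a model
of that device: the function name sample_and_abut, the 8,000 Hz sample
rate, and the treatment of the signal as a Python list are assumptions
made for illustration, and only the 0.03-second interval comes from the
text. Each interval is either shortened (its tail is discarded and the
remainders are abutted) or lengthened (its beginning is reproduced).
Every retained sample is played at its original speed, which is why
pitch is largely preserved.

```python
def sample_and_abut(signal, duration_ratio, sample_rate=8000, interval_s=0.03):
    """Change the duration of `signal` (a list of samples) by the sampling method.

    duration_ratio < 1 compresses (0.6 keeps 60% of each interval);
    duration_ratio > 1 expands (1.8 replays part of each interval).
    The sample rate is an assumed value; the interval is the one reported.
    """
    if duration_ratio <= 0:
        raise ValueError("duration_ratio must be positive")
    chunk_len = max(1, round(sample_rate * interval_s))  # samples per interval
    output = []
    for start in range(0, len(signal), chunk_len):
        chunk = signal[start:start + chunk_len]
        n_out = round(len(chunk) * duration_ratio)
        # Indices past the end of the chunk wrap around, which reproduces
        # the start of the interval when expanding.
        output.extend(chunk[i % len(chunk)] for i in range(n_out))
    return output


# 0.3 s of dummy samples at the assumed rate (ten 0.03 s intervals).
original = list(range(2400))
compressed = sample_and_abut(original, 0.6)  # 60% of the original duration
expanded = sample_and_abut(original, 1.8)    # 180% of the original duration
print(len(original), len(compressed), len(expanded))
```

The abrupt joins between intervals are left as they are in this sketch;
any smoothing at the joins would be a further assumption.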

All subjects listened to the spoken samples under earphones
(TDH-39 with MX 41/AR cushions), with the exception of those who
participated in the normative study, where matched KOSS (Type ESP-6)
electrostatic headphones were used.

All children participating in these studies were determined to
have normal or corrected visual acuity and normal hearing as indicated
by a pure tone screening test at 25 dB HL for the frequencies 500,
1000, 2000 and 4000 Hz.

Experiment  I 
Language  Impaired 

The  subjects  were  20  children  between  5  and  17  years  of  age 
whose  expressive  and  receptive  language  skills  were  sufficiently 
deviant as to preclude their participation in public school
programs. All were enrolled in the Institute of Logopedics, Wichita,
Kansas. 

The Test of Auditory Verbal Comprehension consisted of 50
simple declarative sentences. Sentence complexity was held constant
across all rates and no vocabulary item above a 5-year level was
included. All sentences were read by a female speaker at a rate of
3.6 syllables per second (sps), which was operationally defined as
normal rate of speaking for this type of material and for subjects
of this age range (Stroud, 1967). The original recording of 50
sentences was copied four times and the resulting 200 sentences
were separated and spliced into four different random orders in an
effort to eliminate any effect resulting from sentence order or
subject fatigue. The 20 subjects were divided into four groups.
Each group heard in random order 5 sets of 10 sentences with each
set presented at one of five different rates of speaking.
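
One way to picture the tape preparation is the following sketch, which
is an interpretation rather than the authors' documented procedure.
It assumes that each of the four tapes received its own random sentence
order, that each order was cut into five consecutive sets of ten, and
that the five rates were assigned to the sets in random order. The
function names, the rate labels, and the fixed seed are hypothetical.

```python
import random

# Placeholder labels for the five experimental rates (assumed names).
RATE_LABELS = ["rate 1", "rate 2", "rate 3", "rate 4", "rate 5"]


def build_tape(sentence_ids, rate_labels, rng):
    """Return one tape as a list of (rate_label, [sentence ids]) pairs."""
    order = list(sentence_ids)
    rng.shuffle(order)                      # random sentence order
    rates = list(rate_labels)
    rng.shuffle(rates)                      # random order of rates
    set_size = len(order) // len(rates)     # 50 sentences / 5 rates = 10
    return [
        (rate, order[i * set_size:(i + 1) * set_size])
        for i, rate in enumerate(rates)
    ]


def build_all_tapes(n_sentences=50, n_tapes=4, seed=0):
    """Build one tape per listener group from copies of the same recording."""
    rng = random.Random(seed)               # seeded only for repeatability
    sentence_ids = range(1, n_sentences + 1)
    return [build_tape(sentence_ids, RATE_LABELS, rng) for _ in range(n_tapes)]


tapes = build_all_tapes()
for number, tape in enumerate(tapes, start=1):
    print("tape", number, [(rate, len(ids)) for rate, ids in tape])
```

Under these assumptions every sentence occurs once on each tape, and
across the four groups a given sentence is likely to fall at different
positions and at different rates.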

Five  experimental  rates  of  speech  were  selected:  (1)  normal  rate, 
(2)  two  rates  slower  than  normal  and  (3)  two  rates  faster  than  normal. 
The  limits  of  either  expansion  or  compression  were  dictated  to  some 
extent  by  the  degree  to  which  the  equipment  would  function  without 
introducing  undue  perceptual  distortion.  A  midpoint  between  the 
extremes  available  was  selected  and  adjusted  so  that  the  rate  of 
ongoing  speech  was  3.6  syllables  per  second  and  this  was  designated 
on  a  percentage  scale  as  100%.  Thus,  rate  1  was  60%  or  a  rate  of 
6.8  sps;  rate  2  was  80%  or  a  rate  of  5.0  sps;  rate  3  was  100%  or 
3.6  sps;  rate  4  was  140%  or  2.9  sps;  and  rate  5  was  180%  or  2.3  sps. 
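
The text does not state how the percentage scale relates to the
syllable rates. The short check below is offered only as an aid to
reading these figures: it assumes that the percentage expresses
duration relative to the 100% condition and compares the rate that a
strictly proportional change would give with the rate reported. The
function name and the proportionality assumption are not from the
source.

```python
NORMAL_RATE_SPS = 3.6  # rate designated as 100% in the text

# Percentage setting -> rate reported in the text (syllables per second).
REPORTED_SPS = {60: 6.8, 80: 5.0, 100: 3.6, 140: 2.9, 180: 2.3}


def proportional_rate(percent, normal_sps=NORMAL_RATE_SPS):
    """Rate implied if duration scaled exactly with the percentage setting."""
    return normal_sps * 100.0 / percent


for percent, reported in REPORTED_SPS.items():
    implied = proportional_rate(percent)
    print(f"{percent:>3}%  reported {reported:.1f} sps  proportional {implied:.2f} sps")
```

Strict proportionality would give 6.0, 4.5, 3.6, about 2.57, and 2.0
sps. The reported values differ from these at every setting except
100%, so the percentages are probably best read as nominal settings
and the sps values as the effective rates.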

The  subject's  task  was  to  select  the  correct  response  from  three 
alternatives  after  hearing  the  tape-recorded  stimulus  sentence  via 
headphones. The three multiple-choice response pictures were
presented on 35 mm slides via a back-projection screen below which were
mounted three response buttons for the subject to indicate his
selections.

Analysis of variance for the younger subjects indicated that
comprehension was significantly affected as a function of rate of
speaking (F = 2.91 with 4 and 36 df). The presence of significance
for an N of 10 suggests that a true difference is likely to exist,
since the usual effect of computing an analysis with a small sample
is to reduce the possibility of obtaining a significant F.
Significance was not achieved for the 10 oldest subjects of the sample.
The comparison of performance of the younger listeners with the
total group is shown in Figure 1.
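
The reported degrees of freedom can be checked against the design. The
sketch below assumes a one-way repeated measures analysis in which each
of the 10 younger listeners was scored at all 5 rates; the function
name is hypothetical, and the text does not spell out the model used.

```python
def repeated_measures_df(n_subjects, n_conditions):
    """Degrees of freedom for a one-way repeated measures F ratio."""
    df_conditions = n_conditions - 1
    df_error = (n_conditions - 1) * (n_subjects - 1)
    return df_conditions, df_error


print(repeated_measures_df(n_subjects=10, n_conditions=5))
```

Under that assumption the result is 4 and 36 df, which agrees with the
values reported. Standard F tables give .05 critical values of about
2.69 for 4 and 30 df and about 2.61 for 4 and 40 df, so an F of 2.91
exceeds the .05 criterion.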


It is accepted that high probability words are recognizable with
shorter exposure time (Morton & Broadbent, 1967) and in the present
study only high probability and high frequency words were included.
This may have created a situation in which the stimulus material,
in terms of linguistic structure and vocabulary, was too easy for
the older children. The possible interrelationship of comprehension
with both linguistic structure and frequency of occurrence has been
reported previously by Carrow (1968). Further analysis of rate
effects indicated that speech stimuli delivered at 2.9 sps were more
easily comprehended than stimuli delivered at 5.0 sps and faster, or
at 2.3 sps.

Figure 1. A comparison of auditory comprehension by two groups of
language impaired children under five conditions of rate alteration.
(Graph not reproduced. Horizontal axis: rate in sps, from fast to
slow: 6.8, 5.0, 3.6, 2.9, 2.3. The one legible legend entry is
TOTAL GROUP.)

Experiment II
Mentally Retarded

The subjects were 23 children between 7 and 14 years of age
enrolled in the Starkey Developmental Center for the Retarded (Wichita).
The IQ range was from 23 to 63 with a mean of 44.5 and a median of 46.
The testing instrument was the Stanford-Binet, except for two children
who were tested using the WISC and two using the Leiter International
Scale.

The Test of Auditory Verbal Comprehension was the same as that
described  in  Experiment  I.  The  subject's  task  was  also  the  same, 
except  that  the  response  was  varied  by  having  these  subjects  touch 
the  picture  directly. 

Analysis  of  variance  for  the  total  group,  and  for  the  younger 
versus  the  older  children,  failed  to  yield  statistically  significant 
differences  as  far  as  effects  of  rate  upon  auditory  comprehension 
are  concerned.  It  is  interesting,  however,  to  note  the  generally 
lower  performance  and  greater  variability  in  performance  by  the 
mentally retarded (MR) youngsters (See Figure 2).


While the MR children and the language disordered children
performed essentially alike at the faster rates of speaking, there
was a divergence as the rate decreased through normal to a
slower-than-normal rate. It is also of interest to note opposite effects
resulting from exposure to the most expanded condition. Normal
children and language disordered children experienced decrements in
auditory comprehension at 180% expansion while MR children achieved their
best score at this level of expansion. Preliminary data from a study
now in progress¹ indicate that auditory comprehension for mentally
retarded subjects is best at 200% expansion and that some experience
a decrement only at 225%. These greater expansions were accomplished
by feeding expanded stimuli through the rate-changer a second time.
The product of double expansion is perceptually clear.

¹A master's thesis now in progress by Tom Schroder, Wichita State University.

Figure 2. Auditory comprehension of simple active affirmative declarative
sentences by mentally retarded children under five conditions
of rate altered speech. (Graph not reproduced. Horizontal axis: rate
in sps, from fast to slow: 6.8, 5.0, 3.6, 2.9, 2.3.)

Figure 3. Auditory comprehension of simple active affirmative declarative
sentences by normal children under conditions of rate altered
speech (including children recommended for special reading). (Graph
not reproduced. Horizontal axis: rate in sps, from fast to slow:
6.8, 5.0, 3.6, 2.9, 2.3.)

Experiment  III 
Normal  Children 

The  subjects  were  20  children  who  were  either  6  or  7  years  of 
age  and  enrolled  in  a  regular  first  grade  classroom  in  the  Wichita 
Public  School  System.  The  test  of  auditory  comprehension  was  the 
same  as  that  described  in  previous  experiments  and  the  subject's  task 
remained  the  same;  however,  in  this  instance  a  group  administration 
was  used.  The  multiple-choice  slides  were  projected  onto  a  screen 
easily  visible  to  all  children  and  the  stimulus  sentences  were  delivered 
free field at an intensity easily perceived by all children (70-80 dB SPL).
Each child marked his response on a specially prepared answer sheet by
circling one of three numbers which corresponded with the projected
alternatives.

It  had  been  hypothesized  that  children  enrolled  in  a  regular  first 
grade  would  perform  equally  well  regardless  of  the  rate  at  which  speech 
was  delivered  and  that  there  would  be  virtually  errorless  performance. 
Analysis  revealed  a  statistically  significant  difference  according  to 
rate of speaking. Several explanations may be offered for this
unexpected result. For example, recent work in psycholinguistics
suggests that, contrary to widely reported norms, children have not
acquired mastery of their grammar by age five. Briere (1969) and
Chomsky (1969) both present evidence that this skill continues to
develop at least until age 10.

Perusal  of  the  records  on  these  children  revealed  that  several 
had  been  designated  for  special  reading  the  next  year.  This  prompted 
a  second  analysis  in  which  the  children  with  potential  reading 
problems were omitted; now, the performance was essentially a straight
line function with all children obtaining scores between 97% and 99%
accurate.

The hypothesis that normal children would maintain extremely
accurate auditory comprehension of spoken sentences regardless of
rate was confirmed. Children who were progressing normally with
respect to communication skills obviously did not find the use of
simple affirmative sentences a very stringent test of auditory
comprehension. It was felt that a more discriminating procedure should
be used in order to look at the time required for a normal child to
receive and process spoken events which do not provide the
predictive cues of a syntactically accurate sentence. Therefore, 30
nonsense words which conformed to rules for sequencing sounds in English
were created. These stimulus words were divided into three sets of
10 words each, with approximately equal numbers of monosyllabic,
bisyllabic and trisyllabic constructions in each list. In this
phase of the work, only three rates of speaking were employed: (a)
a normal rate of 3.6 sps; (b) an accelerated rate of 6.8 sps; and
(c) a reduced rate of 2.9 sps. Rate order was rotated among the
tapes so that all subjects heard all of the stimulus words at all
three rates of speaking. For this aspect of the investigation,
subjects heard the stimuli at a signal intensity of 70 dB SPL under
matched KOSS (ESP-6) headphones. The subject's task was to repeat
aloud the word he heard; his utterance was transcribed in the
International Phonetic Alphabet and simultaneously recorded for later
verification and scoring. The criterion measure was the number of
correct phonemes and syllables repeated by each subject.

This investigation was done with Paul Hagler, now speech clinician,
Glenrose Hospital, Edmonton, Alberta, Canada.

The results indicated that reduced rate of speaking facilitated
the auditory perception of nonsense words and that even the expanded
rate, which yielded the highest percentage of correct repetitions,
taxed the auditory perceptual skills of normal children. The relative
performance with sentences and with nonsense words is shown in
Figure 4. The mean correct repetition at the slowest rate of
speaking was 63.8%. At normal rate of speaking the accuracy of
repetition dropped to 57.7% and at the more rapid rate of speaking
the listeners achieved only 32.2% correct imitation. These results
are in contrast to the near-perfect scores achieved by these same
subjects when the stimuli gave the added syntactic and prosodic
cues. The implications for what might be appropriate rates of
speaking to facilitate initial learning of language and speech
provide an interesting area of speculation.

Figure 4. The relative auditory comprehension of simple sentences and
nonsense words by normal first graders under three rates of
speaking. (Graph not reproduced. Horizontal axis: rate in sps:
6.8 (fast, 60%), 3.6 (normal, 100%), 2.9 (slow). Legend: sentences;
nonsense words.)

Experiment  V 
Normative  Data 

The subjects were 360 normal children selected from seven
randomly chosen elementary schools in the city of Wichita, Kansas. One
experimental classroom per grade level per school was chosen.
Normality was defined in terms of (a) audition and vision; and (b) the
absence  of  special  problems  involving  speech,  reading,  or  emotional 
behavior. 

In  this  investigation,  four  variations  of  rate-altered  speech 
were  employed.  In  order  to  meet  the  requirement  of  four  equivalent 
sets of sentences which were graduated in difficulty, ten
sentence-pairs involving no transformational grammatical contrast were
selected from those used by Lee (1971) in the comprehension portion of
the Northwestern Syntax Screening Test (NSST). Ten additional matching
sentence-pairs were constructed to
illustrate  the  same  grammatical  features.  The  four  rates  of  speaking, 
in  syllables  per  second,  were  5.0,  4.2,  3.4,  and  2.6.  The  four  sets 
of  10  sentences  each  were  presented  at  each  of  the  four  experimental 
rates  and  each  set  began  with  the  easiest  and  ended  with  the  most 
difficult  sentence. 

The response method was essentially the same as in previous
investigations, except that the multiple choices were provided by
four line-drawing pictures on a single page for each of the stimulus
sentences. Simple practice sentences were provided before each new
set of 10 sentences which were to be presented at a different
speaking rate. Children recorded their decisions by marking an "X" on
the picture selected. Six subjects, visually isolated from one
another, were tested simultaneously under six matched pairs of KOSS
Electrostatic headphones.

The  data  were  analyzed  in  terms  of  comprehension  of  spoken  language 
by  normal  children  as  a  function  of  speaking  rate,  sentence  difficulty 
and  listener  age  and  sex  with  repeated  measures  analysis  of  variance 
for  a  multifactor  experiment  (Winer,  1971).  In  brief,  the  results 
showed  that:  (a)  children  comprehended  language  differently  as  a 
function  of  age  between  5-6  and  9-6  years,  with  the  greatest  difference 
between  any  two  successive  age  levels  occurring  at  the  lower  end  of 
the  age  scale;  (b)  normal  children  appeared  still  to  be  developing 
facility  with  the  comprehension  of  syntactic  structures  beyond  the 
age of 5 years, as evidenced by significant age-by-difficulty
interactions; (c) speaking rate interacted with sentence difficulty and
listener  age  to  affect  differentially  the  ability  of  normal  children 
to  comprehend  spoken  language;  and  (d)  comprehension  showed  some 
slight  improvement  as  speaking  rate  was  slowed  down,  even  for  normal 
children  included  in  this  study. 

Experiment  VI 
Reading  Disordered  Children 

The subjects were 120 children equally divided between experimental
and control groups with 20 subjects at each of the age levels 7, 8, and
9 years. The controls were randomly selected from first, second and
third grades in geographically distributed schools in the city of
Wichita. The same subject selection criteria identified in the
previous experiment applied here. Experimental subjects were enrolled
in special reading programs in the public schools.

The  stimuli,  method  of  presentation,  and  response  method  were 
the  same  for  this  investigation  as  that  described  in  Experiment  V. 
For purposes of analysis, the sentence types were grouped into two
levels of difficulty, labeled More Difficult and Less Difficult.

With all the variables considered (two experimental groups, three
listener ages, and two levels of sentence difficulty), a 2 x 3 x 2
repeated measures analysis of variance design (Winer, 1971)
was utilized. While the order of difficulty of the 10 sentences in the
comprehension portion of the NSST had been empirically determined
on 344 children, the 120 children in this investigation responded to
two of the sentence types in a distinctly different manner. The
sentence pair "The milk spills" versus "The milk spilled" yielded
such a high error rate that they were discarded for purposes of
analysis, leaving four test sentences at each rate-by-complexity
combination for a total of 32 sentences heard by each of the 120
subjects, or 3,840 sentences in all.
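
The sentence totals follow from the design and can be verified with a
few lines of arithmetic. The sketch assumes the four speaking rates
carried over from Experiment V; the function and parameter names are
hypothetical.

```python
def sentences_heard(n_rates=4, n_difficulty_levels=2, per_cell=4, n_subjects=120):
    """Return (sentences per subject, sentences across all subjects)."""
    per_subject = n_rates * n_difficulty_levels * per_cell
    return per_subject, per_subject * n_subjects


print(sentences_heard())
```

Four rates by two difficulty levels by four sentences gives 32
sentences per child, and 32 sentences for each of 120 children gives
3,840 responses, matching the figures in the text.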

The results indicate that children in special reading programs
were less proficient in comprehending spoken language regardless of
age, although the trend for both groups was to improve with age
(See Figure 5). Differential effects of speaking rate upon
comprehension were also found and are illustrated in Figure 6.

Figure 5. The relative performance of normal and reading disordered
children on an auditory comprehension task according to age of
listener. (Graph not reproduced. Horizontal axis: age in years.)

Figure 6. A comparison of auditory comprehension by normal and reading
disordered children under four conditions of rate altered speech.
(Graph not reproduced. Horizontal axis: rate in sps.)


A  comparison  of  the  comprehension  of  sentences  varying  with  respect 
to  difficulty  at  different  age  levels  shows  that  children  with  reading 
disorders  perform  less  well  than  their  normal  counterparts.  (See 
Figure  7). 


Listener sex was analyzed separately for both groups; no
differences were found among any of the conditions for either group. In
brief,  rate  of  speaking  had  a  significant  effect  only  for  those  children 
in  special  reading  programs,  with  the  better  comprehension  coming 
at  the  slower  rates  of  speaking  and  with  the  slowest  rate  yielding 
significantly  higher  auditory  comprehension  scores. 

Experiment  VII 
Black  English 

The subjects were 80 children ranging in age from six through
nine years who were equally divided into two groups based on whether
Black English constructions were in evidence during spontaneous speech
or imitation of complex sentences, or whether only Standard English
forms were in evidence during the interviews. All other selection
criteria remained the same as in Experiment V.


Figure 7. The relative auditory comprehension of normal and reading
disordered children at three age levels for spoken material
at two levels of difficulty. (Graph not reproduced. Horizontal axis:
age in years. Legend: less difficult and more difficult material,
for normal and reading disordered children.)

Figure 8. The relative auditory comprehension of children using either
Standard English or Black English according to age and rate of
speaking. (Graph not reproduced. Horizontal axis: age in years.
Legend: SE slow; BE slow; SE fast; BE fast.)


All of the experimental procedures for preparation and
presentation of stimuli as well as response method described in
Experiment V were utilized in this investigation.

The analysis of variance for repeated measures (Winer, 1971)
yielded significant differences between the experimental and control
groups (F = 60.16; df = 1, 152; p < 0.01) with the Black English
speaking children experiencing greater difficulty comprehending Standard
English than was true for Standard English speaking children (See
Figure 8). As has been true in all other studies reported here,
comprehension of both groups improved with increasing age (F = 40.06;
df = 3, 456; p < 0.01). For the Standard English speaking children,
rate of speaking appeared to have no influence on auditory
comprehension; however, for Black English speaking children rate of
speaking affected their ability to comprehend Standard English. Again,
these results would appear to have strong implications for classroom
management and give direction to the type of special assistance which
may be indicated.

General Conclusions

The common purpose behind all investigations reported was to
determine some of the effects of rate-altered speech upon auditory
comprehension by normal children, by children learning a different
dialect, and by children having various learning problems. It appears
that children continue to mature, with respect to their comprehension of
both simple and difficult syntactic structures, at least through age
nine. It is also clear that the younger children require more time to
comprehend and respond to verbal communication than is true for
relatively older children. This tendency is markedly amplified where
children give evidence of learning problems, such as language disorders
and reading disorders, or where they fall in the general category of
mental retardation.

The advantage of slower rates of speaking was accentuated under
conditions where children had to listen to nonsense words and could
not utilize a linguistic code to facilitate processing of the whole
acoustic event. It seems reasonable to conjecture that initial
learning of words or sequences of words may require time constraints more
like those required for processing nonsense syllables. The
implications for teaching a first language or instructing in a second
language may be significant in saving instructional time and yielding
a better end product. The data on Black English speakers and Standard
English speakers, indicating that children who use Black English do not
comprehend Standard English as well as users of Standard English, are
not surprising, but the data also indicate that comprehension of
Standard English is improved where speech-expanded sentences are
presented. This, too, may have important implications for teaching
Standard English as a second language. Overall, there is substantial
evidence that speech expansion is a worthwhile area of investigation with
implications for the instruction of children who do not demonstrate
facility with various forms of communication.




References 


Briere, E. J. Testing ESL skills among American Indian children.
In J. E. Alatis (Ed.) Monograph Series on Languages and
Linguistics, 22: 133-142 (1969).

Broen, P. A. The verbal environment of the language-learning child.
ASHA Monographs, No. 17, Washington, D.C.: American Speech and
Hearing Association (1972).

Carrow, Sister M. A. The development of auditory comprehension of
language structure in children. J. Speech and Hearing Disorders,
33: 99-111 (1968).

Chomsky, C. The Acquisition of Syntax in Children from 5 to 10.
Cambridge: The M.I.T. Press (1969).

Efron, R. Temporal perception, aphasia and deja vu. Brain, June,
403-423 (1963).

Eisenson, J. Developmental aphasia: A speculative view with
therapeutic implications. J. Speech and Hearing Disorders, 33: 1, 3-13
(1968).

Fraser, C., and Roberts, N. Mother's speech to children of four
different ages. J. of Psycholinguistic Research, 4: (1975).

Lee, L. L. Northwestern Syntax Screening Test. Chicago: The
Northwestern University Press (1969-1971).

Lowe, A. D., and Campbell, R. A. Temporal discrimination in aphasoid
and normal children. J. Speech and Hearing Research, 8 (3),
313-315 (1965).

Morton, J., and Broadbent, D. E. Passive versus active recognition
models OR Is your homunculus really necessary? In Wathen-Dunn,
W. Models for Perception of Speech and Visual Form, 103-110,
Cambridge: The M.I.T. Press (1967).

Stroud, R. V. The comprehension of rate-controlled speech by
second-grade children with functional misarticulations. In Foulke, E.
(Ed.) Proceedings of the Second Louisville Conference on Rate
and/or Frequency Controlled Speech, October 22-24, 1969,
Winer,  B.  J.  Statistical  Principles  in  Experimental  Design,  New  York: 
McGraw-Hill  (1971). 




Theories  of  the  Origins  of  Language 
by  Miron,    M.  S. 


Murray  S.    Miron,    Ph.  D. 

Psycholinguistics 

Psychology  Department 

Huntington  Hall 

Syracuse    University 

Syracuse,    New  York     13210 




Theories  of  the  Origins  of  Language 

At  one  time,  theories  of  how  man  first  began  talking  were  extremely  popular. 
So  many  such  theories  were  advanced  that  at  one  time  they  were  actually  banned 
by  the  French  Linguistic  Society.   Obviously,  all  such  theories  must  be 
speculative,  since  the  necessary  evidence  for  conclusive  demonstration  of  any 
such  theory  is  irretrievably  lost  in  the  earliest  origins  of  man.   Nonetheless, 
such  speculation  can  serve  the  valuable  function  of  directing  our  attention 
to  critical  aspects  of  the  language  function  in  man.   In  this  section,  I 
shall  review  three  theories  which  have  been  suggested  by  others  and  my  own, 
not  entirely  new,  suggestion  as  to  how  it  all  may  have  started.   All  of  these 
theories are facetiously named, in order to emphasize their grossly
speculative character, but bear in mind that they do deal with very real
phenomena of
language  use. 

The pooh-pooh theory suggests that man's earliest language consisted
of reflexive vocal responses to innately arousing stimuli. Bitter substances
taken into the mouth characteristically elicit spitting responses which are
adapted to expelling the noxious stimulus. The noise which accompanies such
spitting would naturally serve as the sign of something bitter in communicating
a warning from one man to another. Thus, spitting could come to mean, i.e.,
symbolize, bitterness, distaste and even disgust. Similarly, startling
stimuli reflexively evoke a sharp intake or expulsion of breath, and thus,
the characteristic noise associated with catching one's breath could come
to stand for surprise. Our modern uses of language certainly display such
responses in our expletives. The Bronx cheer is an elaborate exclamation of
disgust which has more than casual relationship to expelling distasteful
substances from the mouth.




The  bow-wow  theory  suggests  that  the  origins  of  language  are  to  be 
found  in  the  natural  correspondence  between  the  noises  characteristically 
associated  with  certain  objects  or  things  and  man's  attempts  to  imitate 
these  characteristic  sounds.   Thus,  a  man's  imitation  of  a  predator  could 
have  served  as  the  symbol  standing  for  that  animal.   Onomatapoeic  processes, 
our  attempts  to  make  sounds  appropriate  to  the  things  those  sounds  signify, 
are  clearly  present  in  all  languages.   Words  are  coined  and  accepted  in 
large  measure  because  they  sound  appropriate  to  their  meanings.   We  need 
not  only  use  learned  examples,  such  as  Poe's  tintinnabulation.   The  process  is 
at  work  in  such  universally  accepted  words  as  chick  and  kitty.   When  the 
mouth  opening  is  narrowed  to  produce  the  high  front  vowels  of  words  like  pit 
and  Pete,  the  natural  resonant  frequency  of  the  oral  cavity  is  elevated. 
High  pitched  sounds  are  characteristic  of  the  resonant  frequencies  of  small 
things.   Thus,  our  names  for  the  smaller,  younger  versions  of  grown  up 
animals  usually  modify  the  adult  form  of  the  name  be  replacing  the  vowel  with 
one  of  higher  harmonic  frequencies.   (Exceptions  abound  in  such  contrasts  as 
sheep  and  lamb  for  which  the  vowel  change  is  in  the  opposite  direction.   The 
principle  is  hardly  universal.)   Notice  that  the  progression  tiny,  teeny, 
teeny  weensie  seems  to  naturally  fit  our  expectations  of  going  from  small  to 
smallest.   The  vowel  of  teeny  is  produced  with  harmonics  of  higher  frequency 
than  that  of  tiny,  in  fact,  with  the  highest  frequency  of  harmonic  energy  of 
any  vowel  in  English.   Hence,  if  we  want  to  get  smaller,  we  must  repeat  the 
vowel (with a little shove from the lip rounding w). Such effects in language
are  known  as  phonetic  symbolism;  i.e.,  the  symbolic  value  of  sounds  as  sounds. 
They  differ  from  true  onomatopoeia  only  in  the  transparency  of  the  sound 
and significance relationship. It is onomatopoeic and obvious that flimsy
things should rustle; it is purely imitative to say that chicks go peep;
and more opaque and phonetically symbolic to say that a character named
Gutch should never be the hero of a drama. But the distinction is rather
academic. All of these examples represent a natural association between the
sound of the name of a thing and the thing itself. Attempts to widen the
application of this onomatopoeic principle in natural languages have led some
scholars  to  lengthy  lists  of  words  whose  sound  and  meaning  are  correlated. 
A linguist by the name of Householder has suggested, for example, that all
words with the vowel sound of but signify things which are deficient in some
regard. Thus a mutt is a deficient dog, a cut is a deficient skin, a hut is a
deficient  house,  etc.   But  (deficient,  that  is)  a  host  of  counter  examples 
can  always  be  marshalled;  e.g.,  love,  fun,  sun,  etc.   Like  the  lamb,  these 
counter  instances  serve  to  remind  us  that  no  single  principle  is  going  to 
account  for  the  diversity  of  the  forms  meaning  may  take  in  language.  An 
extremely  interesting  experiment  by  Roger  Brown,  however,  demonstrates  the 
power of such sound correspondences and their potential universality in all
languages.   Students  from  Harvard  and  Radcliffe  were  asked  to  try  to  guess 
the  translation  of  a  series  of  familiar  English  antonym  pairs  into  Czech, 
Chinese  and  Hindi.   The  English  words  such  as  warm-cool  and  heavy-light  were 
spoken  in  each  of  these  foreign  languages  and  the  subjects  were  asked  to 
match  the  written  equivalents  with  the  English  pairs  on  their  test  sheets. 
None  of  the  subjects  was  familiar  with  any  of  the  languages  used.   Of  21  such 
pairs,  the  subjects  translated  a  range  of  88%  to  33%  of  the  pairs  correctly. 
Overall,  they  correctly  translated  an  average  of  58%  of  the  pairs.   Such 
performance  is  significantly  above  what  one  should  have  expected  by  chance. 
Even  if  one  might  argue  that  students  from  these  two  universities  are 
somewhat  strange,  it  is  difficult  to  understand  such  results,  unless  there 
is  some  sort  of  pan-language  tendency  to  match  sound  with  meaning. 
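
The chance baseline behind this claim can be made concrete. The following is a minimal sketch, assuming that matching each antonym pair is a two-way choice, so that pure guessing succeeds with probability 0.5 on each of the 21 pairs. The function name is hypothetical and the calculation is not taken from Brown's report; it gives only the exact binomial probability that one guessing subject reaches a given score.

```python
from math import comb

N_PAIRS = 21  # number of antonym pairs reported in the text


def tail_probability(k: int, n: int = N_PAIRS) -> float:
    """P(X >= k) for X ~ Binomial(n, 0.5): the chance that a subject who is
    purely guessing gets at least k of the n pairs right."""
    favourable = sum(comb(n, i) for i in range(k, n + 1))
    return favourable / 2 ** n


if __name__ == "__main__":
    # Scores chosen only for illustration; they are not data from the study.
    for correct in (12, 15, 18):
        print(correct, "of", N_PAIRS, "->", round(tail_probability(correct), 4))
```

A single subject scoring 12 of 21 (about 58%) would not be remarkable, since a guesser does at least that well roughly one time in three. The significance of a 58% group average therefore rests on the number of subjects pooled, which the passage does not report.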




Unlike  the  other  two  theories  so  far  reviewed,  in  which  the  origins 
of  language  are  assumed  to  lie  in  some  non-arbitrary  and  natural  connection 
between  words  and  their  meanings,  Thorndike  proposed  a  theory  which  assumes  a 
completely arbitrary connection. Dubbed the babble-luck theory, Thorndike's
supposition  assumed  that  man  need  only  have  made  random  noises  for  a  lucky 
accident  to  have  established  a  particular  noise  as  the  name  of  a  thing.   Thus, 
if  while  snorting,  grunting  and  otherwise  amusing  himself,  early  man  happened 
by  chance  to  make  a  particular  noise  in  the  presence  of  some  object,  that 
noise,  if  reinforced,  would  come  to  stand  for  the  object  in  exactly  the 
manner  suggested  by  Skinner's  analysis  of  verbal  behavior;  i.e.,  as  a  discrim- 
inated operant. 

But  why  should  man  have  wanted  to  talk  at  all?   It  has  always  struck  me 
as  a  dissatisfying  leap  to  go  from  an  upright  walking  ape  to  a  talking  or  even 
babblingly lucky man. And what of the vast void which lies between that first
word, no matter how come by, and the permuting of those isolated words into sentences?
Because such things tend to make me lose sleep (psycholinguists are peculiar
that way), I have decided to brave the ban against such theories and suggest
still  another  one  which  is  only  modestly  new.   (It  borrows  extensively  from  an 
early  linguist  by  the  name  of  Jespersen.)   And,  because  I  earnestly  believe 
what I say (probably the more so because the proof will forever be lacking), I
have  decided  that  my  theory  shall  be  pretentiously  named  the  Homo  Melodens 
Theory. 

I  begin  by  assuming  that  the  missing  link  between  ape  and  talking  man 
(Homo Loquens) was a sort of singing ape. Song, as used by most primitive
man, originally served only a biological function and later was modified to
serve the cultural information transmission functions of our current language
forms. In order to develop this theory and its implications, I must digress
briefly  to  explore  the  evolutionary  development  of  song  in  the  bird. 

Ground nesting birds (Galliformes), such as the chicken and pheasant,
and the dovelike forms (Columbiformes) represent an older genetic stock in
birds. The two remaining families of birds, the Passeriformes (perching
song birds) and the Psittaciformes (parrot-like forms) represent a later
evolutionary stage in the species development of birds. Only the Passeriformes
have developed song; the other bird families have developed only
call notes which do not display the complex phrasal characteristics of true
song. Half of all the bird forms distributed over the planet belong to
the passerine family. Fringilla coelebs, a passerine commonly called the
European chaffinch, has a repertoire of some six distinct song patterns.
These  differing  song  patterns  are  produced  by  varying  the  number  of  notes 
repeated  in  each  phrase  unit  of  the  thematic  song  and  by  alteration  within 
the  phrase  constituents.   The  song  is  produced  by  the  exhalation  of  air 
(the  active  phase  of  bird  respiration,  while  man's  active  phase  is  inhalation) 
past a paired set of tympaniform membranes located at the juncture of the
bronchi. This larynx-homologous structure is called the syrinx. The syrinx
is innervated by a bilaterally paired set of nerve pathways, the ramus
descendens superior of the hypoglossus nerve and a much smaller branch of
the vagus nerve. Song display is predominantly observed in the male bird
during the mating season which, in the chaffinch, occurs every 10 months.
The evidence indicates that song display is closely associated with testosterone
hormonal levels in the bird. The newly hatched chaffinch, during his
first 10 months of life, moves through a progression of regular stages in
the development of song which are remarkably similar to the stages observed
in the developing child. The earliest vocalizations of the young bird
consist  of  simple  call  notes  of  distress  and  pleasure.   In  the  next  stage 
of  development,  the  subsong  stage,  the  young  bird  exhibits  a  highly  variable 
set  of  vocalizations  of  low  volume  which  utilize  the  earlier  call  notes  in 
rambling  sequences.   The  next  stage  of  development,  called  the  plastic  song 
stage,  is  characterized  by  the  use  of  elements  of  the  eventual  adult  song, 
but  without  the  necessary  phrasal  organization  which  marks  the  adult  pattern. 
This progression from calls to subsong, plastic song and adult pattern is
strangely  like  the  child's  progression  from  distress  noises,  through  babbling, 
to  isolated  words  and,  finally,  to  syntactic  patterns. 

Song in the bird serves a variety of biological functions. It simultaneously
serves as territory and mate advertisement and as an aggressive
display.   Assortative  mating,  produced  by  differential  attractiveness  of 
mate  advertisement  or  geographic  isolation,  is  the  primary  evolutionary 
mechanism  for  subspeciation,  development  of  distinct  life  forms  from  a  common 
ancestral  stock.   Elementary  evolutionary  theory  posits  four  systemic  influ- 
ences which  produce  the  genetic  changes  eventually  leading  to  speciation. 
Exploitive  effects  are  produced  by  particular  ecological  niches  which  favor 
the  exploitation  of  a  particular  genetic  characteristic  of  the  animal  form. 
Intermating of the forms co-exploiting the niche produces distinctive
genetic  characteristics  which  may  differ  from  originally  con-specific  forms 
in  other  niches.   Epigenetic  effects  are  produced  by  particular  environmental 
stresses  acting  upon  the  developing  life  form  which  reveal  potentialities 
of  the  genotype  and  produce  a  distinctive  phenotypical  behavior.   Natural 
selection  acts  to  select  those  forms  of  a  species  which  display  selected 
characteristics of fitness for the specific stresses encountered. The
purely genetic effects are seen in mutations which serve to modify the
potentialities  of  the  life  form. 

Species  specific  song  patterns  in  the  bird  serve  to  select  mates 
whose  genotypes  display  niche  adaptive  characteristics  in  order  to  produce 
offspring  which  continue  and  further  the  genetic  adaptiveness.   In  order 
that  such  genetic  information  is  transmitted  with  faithfulness,  some 
mechanism  for  assuring  that  the  offspring's  gene  advertisement  represents 
that  of  his  parents  must  be  present.   One  such  device  is  to  provide  for  a 
critical  period  during  which  the  developing  form  imprints  the  mate  adver- 
tisement of  his  parents  and  then,  subsequently,  after  striking  out  on  his  own, 
loses  the  capacity  to  modify  that  advertisement.   A  more  primitive  mechan- 
ism would  guaranty  fidelity  of  genetic  advertisement  by  coding  a  species 
pattern  of  display  directly  into  the  genes  which  unfolds  without  any  environ- 
mental interaction.   Both  of  these  forms  are  observed  in  birds.   Ring  doves 
develop  the  characteristic  calls  of  their  species,  even  if  deafened  during 
the  first  day  of  life.   The  song  sparrow,  on  the  other  hand,  will  not 
develop  normal  song  unless  his  hearing  is  intact,  even  though  he  may  be 
isolated  from  birth  and  never  hear  the  adult  model  of  his  species  song. 
The chaffinch represents a still further stage of environmental control.
The chaffinch requires both the conspecific adult song model and the capacity
for auditory self-stimulation in order to develop normal song. This
continuum  from  a  completely  closed  phenotypic  behavior  developing  out  of  a 
genotype  without  environmental  interaction  to  an  open  genetic  system  requir- 
ing environmental  support  parallels  the  evolutionary  development  of  birds. 
A completely closed instinct, although it guarantees species identification,
is conservatively slow in its genetic reaction to epigenetic effects. As
the genetic system is increasingly opened to environmental influences, it
has greater adaptiveness to environmental stress. Nonetheless, there must
remain  some  degree  of  genetic  conservatism  in  the  trait  in  order  that  the 
life form may continue to evolve the already successful genetic characteristics
of his species. This conservatism is evidenced in the chaffinch
in that the young bird rejects adult song models from species other than
his own. If a chaffinch is reared with adult song birds of another
species,  it  does  not  develop  either  its  own  adult  song  model  or  the  song  of 
the  species  different  model  provided  him.   Evidence  of  the  influence  of 
the  environment  is,  however,  observed  in  the  dialects  of  song  displayed 
by  chaffinches  genetically  isolated.   The  chaffinches  of  South  Africa  and 
New  Zealand  have  been  isolated  from  the  parent  European  stock  for  some  80 
years.   The  song  of  these  isolated  birds  can  be  detected  as  different, 
but  the  conservative  influences  of  the  genetic  template  of  the  song  pattern 
are also seen in that these dialect differences are small. The species
theme is still clearly recognizable, despite the lengthy separation, and
the difference is no larger than that observed within the geographic dialect variations
of  the  European  stock  itself. 

Perhaps the most startling of the set of fascinating parallels which the
song bird provides comes from a series of experiments performed by
Nottebohm on the chaffinch. If the left hypoglossus nerve of the adult
chaffinch is severed, the already developed song of the bird exhibits
peculiar defects which have at least provocative similarities to aphasic
symptoms in man. Specific elements of the normal song are lost from the
bird's vocalizations to produce an effect which looks something like the
telegraphic patterns of Broca's aphasia. If the right hypoglossus is severed,
these effects are not observed. Nor is the effect observed when the left
hypoglossus of the newly hatched bird is severed, indicating that the
disability is neither motoric nor peripheral in origin.


If  song  represents  an  earlier  and  more  primitive  version  of  man's 
communication  system  serving  strictly  biological  functions  as  in  the 
case of bird song, a number of incidental observations might be explained.
First, many current languages are highly tonal in character. Differences
in pitch accompanying particular sounds in such languages as Mandarin and
Luganda, for example, represent differences in meaning. The earliest
forms of Latin and Greek were also tonal as recorded by the early grammarians.
Song is universally employed in all cultures and, in fact,
there  is  a  significant  correlation  between  the  primitiveness  of  the  culture 
and  the  amount  of  song  employed  in  various  ritualistic  functions.   It  is 
a commonplace observation that stutterers have remarkably little difficulty
singing what they cannot say. The earliest expressions of children
tend  to  be  song-like  in  character.   Most  cultures  highly  value  those  who 
sing  well  and,  in  this  culture,  such  talent  allows  performers  extraordinary 
freedom from most of the social taboos whose violation would otherwise
produce  extreme  censure.   We  reward  both  singers  and  actors  with  both 
freedom  and  coin  far  out  of  proportion  to  the  rewards  given  to  our  artists 
or  chess  players.   Acting  is  an  evolved  form  of  singing;  the  actor  succeeds 
to  the  degree  that  the  form  of  what  he  says  is  pleasing.   What  the  actor 
says  is  determined  by  others  who  rarely  achieve  the  same  levels  of  compen- 
sation.  A  catalogue  of  the  presidents  of  the  United  States  indicates  a 
remarkable  sameness  in  the  dialects  used  by  our  leaders;  the  exceptions  are 
outstandingly  rare.   All  of  these  commonplace  observations  might  be  accounted 
for  by  the  assumption  that  language  evolved  from  the  primitive  biological 
expressions  of  song  as  a  mate  advertisement,  with  our  modern  uses  of  language 
still  bearing  traces  of  these  origins. 




Observations on the Measurement of Listening Comprehension
and Learning Efficiency as a Function of Time-Compressed Speech

by Skinner, P. H., & Orr, D. B.




ABSTRACT 


Using  five  rates  of  auditory  presentation  and  three  test  types,  15 
groups  of  elementary  education  students  at  California  State  College, 
Pennsylvania, were each given a passage at one of the 15 possible combinations
of word rate by test type. Results showed that the interaction
of word rate and test type was not significant; that significant differences
were obtained among word rates (but only above 325 words per minute); and
that  the  multiple  choice  test  type  showed  scores  significantly  greater 
than  those  for  two  types  of  cloze  test.  When  learning  efficiency  was 
operationally  defined  as  number  of  questions  answered  correctly  per  minute 
of  time  spent  listening,  it  was  shown  that  in  general  the  higher  word  rates 
of  presentation  were  more  efficient.   The  point  was  made  that  many  results 
of  time-compressed  speech  experiments  are  dependent  upon  methodological 
considerations,  and  particularly  on  (often  unknown)  characteristics  of  the 
measurement  procedures  used.   Further  research  along  these  lines  was  urged. 

************* 

Phyllis  H.  Skinner.   Dr.  Skinner  is  presently  Professor  of  Elementary 
Education  at  California  State  College,  California,  Pennsylvania. 

David  B.  Orr.   Dr.  Orr  is  presently  a  Statistician  with  the  National 
Center  for  Education  Statistics,  Department  of  Health,  Education  and 
Welfare,  Washington,  D.C.,  and  also  President,  AUDO-READ  Systems,  Inc., 
Silver  Spring,  Md. 





(This  paper  is  given  below  essentially  as  it  was  presented  at  the  Conference. 
Dr.  Skinner  presented  the  material  in  Part  I,  and  Dr.  Orr  presented  the 
material  in  Part  II.) 


PART  I  (Dr.  Skinner) 

The  material  which  I  want  to  present  to  you  today  is  based  upon  my 
doctoral  thesis  (Listening  Comprehension  of  Time-Compressed  Speech  as  a 
Function  of  Variations  in  Word  Rate  and  Type  of  Evaluation  Instrument 
(Unpublished),  University  of  Pittsburgh,  1972).  The  study  was  intended 
to  determine  the  extent  to  which  test  type  and  time-compressed  speech  rates 
made  a  difference  in  listening  comprehension  scores  for  groups  of  elementary 
education  students  at  California  State  College,  Pennsylvania. 

Randomly selected groups of students were given an informational passage
concerning the life of Franklin Pierce, 14th President of the U.S.
Preliminary  and  pilot  studies  were  used  to  smooth  out  the  procedures.   In 
the  final  study,  comprehension  was  tested  by  three  test  types  (multiple 
choice,  cloze — random  deletion,  and  cloze — systematic  deletion).   Five  word 
rates  of  presentation  were  used:  175  (essentially  normal  recording  rate), 
225,  275,  325,  and  375  wpm.  Each  of  the  15  possible  combinations  of  word 
rate  and  test  type  was  randomly  assigned  to  one  of  15  different  groups  of 
subjects.  Analyses  of  variance  and  t-tests  were  used  to  identify  significant 
differences. 
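
The 5 x 3 design described above can be sketched as follows. This is only an illustration of randomly assigning each rate-by-test combination to one group; the group labels, the function name, and the use of a seeded shuffle are assumptions, not the procedure actually used in the thesis.

```python
import itertools
import random

WORD_RATES = [175, 225, 275, 325, 375]  # words per minute, as given in the text
TEST_TYPES = [
    "multiple choice",
    "cloze, random deletion",
    "cloze, systematic deletion",
]


def assign_conditions(groups, rng):
    """Randomly pair each group with one (word rate, test type) combination.

    Every combination is used exactly once, so the number of groups must
    equal the number of combinations (15 here).
    """
    conditions = list(itertools.product(WORD_RATES, TEST_TYPES))
    if len(groups) != len(conditions):
        raise ValueError("need exactly one group per condition")
    rng.shuffle(conditions)
    return dict(zip(groups, conditions))


# Hypothetical group labels; the seed only makes the example repeatable.
groups = [f"group {i}" for i in range(1, 16)]
assignment = assign_conditions(groups, random.Random(0))

if __name__ == "__main__":
    for group, (rate, test) in assignment.items():
        print(f"{group}: {rate} wpm, {test}")
```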

Before  discussing  the  results,  I  had  better  say  something  about  the 
test  types,  some  of  which  may  be  unfamiliar  to  you.  The  multiple  choice 
test was of the standard variety with four choices, and was composed of
70 items. The other two tests were the cloze test using a systematic
deletion formula and the cloze test using a random deletion formula. To
construct  the  systematic  cloze  test,  I  selected  a  word  in  the  first  sen- 
tence and  systematically  deleted  every  tenth  word  thereafter,  replacing 
them  with  blanks.  For  the  random  cloze  test,  I  went  through  and  numbered 
all  the  words  and,  using  a  table  of  random  numbers,  randomly  deleted  the 
same  number  of  words  from  the  passage  and  substituted  blanks.  The  subjects 
were required to fill in the blanks after hearing the passage. Subsequently,
70 blanks were randomly selected from each test for scoring so that each of
the three tests would have the same number of scored items. Items were
counted correct only if the exact word deleted was filled in by the subject.
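
The two deletion formulas and the exact-word scoring rule can be sketched in a few lines. This is a minimal illustration under stated assumptions: the passage is already split into words, the starting word of the systematic form is itself deleted, and all function names and the blank marker are hypothetical rather than taken from the thesis.

```python
import random

BLANK = "_____"  # hypothetical marker for a deleted word


def _blank_out(words, deleted):
    """Return a copy of `words` with the given positions replaced by blanks."""
    gone = set(deleted)
    return [BLANK if i in gone else w for i, w in enumerate(words)]


def systematic_cloze(words, start, step=10):
    """Delete the word at index `start` and every `step`-th word after it."""
    deleted = list(range(start, len(words), step))
    return _blank_out(words, deleted), [words[i] for i in deleted]


def random_cloze(words, n_blanks, rng):
    """Delete `n_blanks` words at randomly drawn positions."""
    deleted = sorted(rng.sample(range(len(words)), n_blanks))
    return _blank_out(words, deleted), [words[i] for i in deleted]


def score(answers, key):
    """Count answers that reproduce the deleted word exactly."""
    return sum(a == k for a, k in zip(answers, key))


if __name__ == "__main__":
    passage = [f"w{i}" for i in range(50)]  # stand-in for a real passage
    sys_text, sys_key = systematic_cloze(passage, start=3)
    rnd_text, rnd_key = random_cloze(passage, len(sys_key), random.Random(1))
    print(len(sys_key), len(rnd_key))
```

Drawing the random blanks with the same count as the systematic form mirrors the statement that the same number of words was deleted from the passage in both cloze tests.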

The  results  of  the  study  showed  us  that  listening  comprehension  with 
college  students  declined  significantly  with  word  rate  increases,  but  not 
until the speed of 375 wpm. That is somewhat contradictory to some previous
studies wherein comprehension dropped significantly at lower rates. However,
my college students seemed to have no significant difference in comprehension
at 325 words per minute.

As  far  as  the  test  type  was  concerned,  we  found  that  the  students  did 
significantly  better  on  the  multiple  choice  test.  (Of  course,  they  had  had 
lots  of  practice  in  taking  that  type  of  test  and  very  few  had  more  practice 
in  taking  a  cloze  test,  than  just  the  warm-up  test  taken  prior  to  the  final 
test  passage.)  Thus,  there  was  a  significant  difference  between  the  results 
on  the  multiple  choice  test  and  those  on  the  cloze  tests.  However,  there 
was  no  significant  difference  between  the  cloze  systematic  deletion  formula 
and  the  cloze  random  deletion  formula.  (In  checking  the  reliabilities,  the 
random formula tends to give a slightly more reliable test. I would like to
see somebody else do some more research using these two kinds of test,
because, in the back of my mind, I feel that the random form may be superior
to the systematic form.)

[Figure 1. Number of questions correct per minute of listening, plotted against word rate of presentation, for each of the three test types.]

The  third  finding  of  the  study  was  that  there  was  no  significant 
interaction  between  the  type  of  test  and  the  rate  of  presentation,  suggesting 
that the three test types behaved similarly as word rates increased.

In conclusion, I think that we can and should definitely use compressed
speech for educational presentations, judging from the results of this study.
However, one caution that seems evident is that you cannot use the multiple
choice  and  the  cloze  tests  as  equivalents  because  according  to  the  results 
of  my  study  they  measure  listening  comprehension  somewhat  differently. 

As  a  byproduct  of  my  study  Dr.  Orr  and  I  have  developed  some  interesting 
thoughts  on  learning  efficiency  based  on  data  from  the  study.  Dr.  Orr  will 
tell  you  about  that. 

PART II (Dr. Orr)

Before  I  discuss  learning  efficiency,  I  should  like  to  step  back  several 
years  to  the  previous  conference  here  in  Louisville,  at  which  time  I  made 
some  remarks  about  measurement  issues  and  studies  on  time-compressed  speech. 
As  I  listened  to  the  papers  this  morning,  I  felt  that  several  of  those  issues 
are  being  addressed  now.  I  think  that's  a  step  in  the  right  direction.  For 
example,  we  are  concerned  as  researchers  in  the  area  of  time-compressed 
speech  with  the  problem  of  determining  the  impact  of  treatment  with  time- 
compressed  speech.   That  requires  dependable  measurement.   I  had  suggested 
that  there  were  other  ways  to  look  at  the  measurement  problem  than  the  typical 
multiple  choice  type  of  test,  and  Dr.  Skinner  for  one,  and  I  am  sure  others 
of you, has focused her work on exploring different kinds of tests. However,
I think that we must continue to explore different ways of measuring and
documenting  the  impact  of  exposure  to  time-compressed  speech. 

Another suggestion that I made 6 years ago had to do with different
kinds of response modes as ways of determining treatment impact. I notice
that Dr. McCroskey has used a multiple selection response mode which is the
kind of approach that I had in mind in my remarks at the last conference.
So, I feel that we have made some methodological advances, though more such
work is undoubtedly needed.

There  are  perhaps  some  additional  areas  we  can  explore  along  these  lines. 
I  think  we  need  to  consider  the  type  of  domain  that  is  being  measured  when 
we  work  with  the  time-compressed  speech.   Now  this  has  a  great  deal  of  import 
for  the  type  of  test  we  select,  for  the  reliability  and  validity  of  the  test, 
and  for  the  conclusions  that  we  draw.   To  use  a  simple  illustration,  just  in 
terms  of  number  of  items,  if  we  are  going  to  try  to  measure  a  given  domain, 
the  number  of  items  necessary  to  arrive  at  a  dependable  conclusion  is  a  cri- 
tical dimension.   I  don't  see  much  evidence  that  people  consider  this 
when  they  design  their  time-compressed  experimentation.   For  example,  in  the 
domain  of  physical  sex  it  doesn't  take  very  many  test  questions  to  determine 
what the sex of the respondent is. By and large one question is enough. If,
on the other hand, your intention is to measure the comprehension of a rather
complex passage, it takes many more. How many more depends to a considerable
degree on the type of material, the domain, which is being studied. I
think  we  need  more  work  on  such  considerations. 

One  other  methodological  comment.   There  has  been  some  legitimate  concern 
with  the  problem  of  practice  effect — the  problem  of  acclimatization,  if  you 
will. In compressed speech experimentation you will get rather different
results if you study error rate or reaction time or whatever, as a function
of  levels  of  compression,  using  totally  naive  subjects,  as  opposed  to  using 
subjects  that  have  had  enough  experience  with  time-compressed  presentations 
that  they  are  no  longer  stumbling  over  the  novelty  of  the  effect.   I  think 
that  some  of  the  results  which  have  been  reported  here  may  conflict  on  the 
basis  of  different  controls  on  this  variable.  We  found  at  the  American 
Institutes  for  Research  that  about  15  minutes  of  simple  practice  listening 
to  presentations  at  roughly  twice  normal  speed  was  enough  to  wash  out  a  great 
deal  of  the  novelty  effect.   In  any  case  some  preliminary  exposure  will  pro- 
mote a  firmer  base  for  comparison  when  you  begin  your  treatment  comparisons, 
and  greater  comparability  among  experiments. 

Now,  with  respect  to  learning  efficiency — there  are  obviously  many  ways 
you  can  operationally  define  learning  efficiency,  and  many  specifically 
designed experiments which could be conceived to approach this issue. However,
Dr.  Skinner  and  I  drew  some  evidence  out  of  her  dissertation  data  which  we 
felt  was  worth  presenting  to  you  (though  it  should  not  be  thought  of  as  defi- 
nitive experimentation).   Figure  1  shows  a  plot,  for  each  of  the  three  types 
of  tests  in  Dr.  Skinner's  data,  of  "number  of  questions  correct  per  minute 
of listening time" (of exposure to the original passage), as a function of
word  rate  of  presentation.   It's  a  very  simple  operational  definition  of 
learning  efficiency.   "How  many  questions  can  you  answer  correctly  on  each 
of  these  kinds  of  tests,  as  a  function  of  the  amount  of  time  spent  listening 
to  the  passage."  Obviously,  the  more  questions  correct  per  unit  of  time  spent 
listening,  the  higher  the  efficiency. 
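
The operational definition just given amounts to a single division, sketched below. The passage length and the scores are hypothetical, chosen only to show the arithmetic; the passage length is not reported in the text, and the function names are illustrative.

```python
def listening_minutes(passage_words: int, words_per_minute: float) -> float:
    """Time needed to hear the whole passage at a given word rate."""
    return passage_words / words_per_minute


def learning_efficiency(n_correct: int, passage_words: int,
                        words_per_minute: float) -> float:
    """Questions answered correctly per minute of listening time."""
    return n_correct / listening_minutes(passage_words, words_per_minute)


if __name__ == "__main__":
    PASSAGE_WORDS = 1750                 # hypothetical passage length
    hypothetical = {175: 50, 325: 45}    # word rate -> items correct (made up)
    for rate, correct in hypothetical.items():
        eff = learning_efficiency(correct, PASSAGE_WORDS, rate)
        print(f"{rate} wpm: {correct} correct, {eff:.2f} correct per minute")
```

With these made-up figures the faster presentation yields a lower raw score but a higher efficiency, because the listening time in the denominator shrinks faster than the score does.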

The results indicated that for all three test types the learning
efficiency rose essentially monotonically with increasing compression (speed),
with the single exception of the 375 wpm value on cloze-systematic. These
results  imply  that  in  terms  of  the  gross  amount  of  information  comprehended, 
more  information  was  acquired  for  each  minute  of  time  spent  listening  at  the 
higher  speeds.   In  other  words,  the  decline  in  absolute  comprehension  typically 
experienced  as  presentation  rates  are  increased  was  more  than  offset  by  the 
increased  amount  of  information  received  for  the  time  invested  in  listening. 
Quite  obviously  this  could  not  go  on  over  an  indefinite  range  of  word  rates. 
Nonetheless,  it  would  appear  that  there  is  significant  efficiency  to  be 
obtained  at  some  of  the  higher  rates,  at  least  with  this  population  and  this 
type  of  material.   (You  must  qualify  the  conclusion  in  this  respect,  since 
it  is  sure  to  be  bound  to  some  degree  to  subject  and  type  of  material.) 

The  second  point  that  I  would  like  to  call  attention  to  with  respect  to 
these  plots,  is  that  it  is  quite  clear  from  this  graph  that  the  type  of  test 
makes  a  difference  from  the  standpoint  of  absolute  levels  of  efficiency. 
Now,  I  am  quite  willing  to  believe  that  one  group  of  subjects  did  not  learn 
significantly  more  than  another — it's  a  measurement  difference.  You  get 
apparently  higher  efficiency  with  multiple  choice  tests,  but  this  is  just 
a  function  of  the  type  of  measurement  involved.   Of  course,  similar  results 
were  attained  with  the  cloze  tests  in  terms  of  their  relative  characteristics. 
They  were  lower  in  absolute  level,  but  similar  in  shape. 

So,  to  sum  up  then,  first  it's  my  feeling  that  there  are  still  serious 
measurement  problems  which  remain  and  that  sufficient  attention  is  not  paid 
to  the  measurement  aspect  of  research  in  this  field.   Secondly,  concepts  such 
as  learning  efficiency  are  concepts  that  can  now  be  studied  through  the  use 
of  time-compressed  speech,  and  deserve  our  continued  attention.   Finally, 
preliminary  data  suggest  that  some  compression  produces  beneficial  results  in 
terms  of  efficiency,  and  can  probably  be  used  judiciously  at  the  present  time. 



REFERENCES 


Orr, David B.; Friedman, Herbert L.; and Williams, Jane C. C. "The
Trainability of Listening Comprehension of Speeded Discourse." Journal
of Educational Psychology, June 1965, 56, 148-156.

Orr, David B., and Friedman, Herbert L. "The Effect of Listening Aids on the
Comprehension of Time-Compressed Speech." The Journal of Communication,
September 1967, XVII (3), 223-227.

Orr, David B., and Friedman, Herbert L. "Effect of Massed Practice on the
Comprehension of Time-Compressed Speech." Journal of Educational
Psychology, 1968, 59 (1), 6-11.

Orr, David B.; Friedman, Herbert L.; and Graae, Cynthia N. "Self-Pacing
Behavior in the Use of Time-Compressed Speech." Journal of Educational
Psychology, 1969, 60, 28-31.




The  Effects  of  Irrelevant  Concurrent  Psychomotor  Activity  on  the 
Ability  to  Comprehend  Compressed  Speech 
by  Billman,    J.  T. 




THE  EFFECTS  OF  IRRELEVANT  CONCURRENT 
PSYCHOMOTOR  ACTIVITY  ON  THE  ABILITY 
TO  COMPREHEND  COMPRESSED  SPEECH 


by  Joe  T.  Billman 
ABSTRACT 
This study examined the effects on comprehension of performing
varying amounts of irrelevant concurrent psychomotor
activity while listening to compressed speech at various information
rates. A significant association between comprehension level
and amount of psychomotor activity performed was not found at the
listening rates of 250 and 275 words per minute. The analysis did
suggest that interaction was beginning to take place at the listening
rate of 275 words per minute for subjects performing three bits
of psychomotor activity.

Two pilot studies (N = 40 and N = 32) indicated that
simple, discrete psychomotor activity such as dotting and copying
could be performed at 225, 250, and 275 words per minute, provided
adequate pause time was allowed, without degradation of listening
comprehension (P < .05). One hundred and twenty subjects listened
to pre-recorded material (Part Two, Form 1A of the Listening Test
of the Sequential Tests of Educational Progress) with headsets and
performed 50 sequences of psychomotor activity interspersed throughout.
Four seconds of pause time was provided to perform the psychomotor
activity. Psychomotor activity was based upon information
theory and quantified at three levels: one bit, two bit, and three
bit. Stimulus items were presented in the visual channel. Subjects
responded by writing a response code group on a response sheet.
Comprehension was measured by 36 multiple-choice questions. Analysis
of variance for comprehension, by listening rate, versus psychomotor
activity level indicated that the performance of up to three bits
of psychomotor activity did not produce a significant effect on
comprehension (P < .10).

The  results  suggested  that  learners  would  be  able  to 
listen  to  compressed  speech  at  250  and  275  words  per  minute  and  also 
perform  up  to  three  bits  of  complex,  concurrent  psychomotor  activity 
such  as  switching  perceptual  channels,  observing  visuals,  copying 
data,  and  making  overt  responses  without  degradation  of  comprehension 
if  adequate  pause  time  was  provided. 

Dr. Billman is currently serving as a Supplementary Faculty member
at Oklahoma City University, Oklahoma City, Oklahoma and at Oscar Rose
Jr. College, Midwest City, Oklahoma.





The literature suggests a broad range of time saving
possibilities for the use of compressed speech. It does not,
however, provide any substantial body of information relating
to the long-term use of compressed speech. If the use of compressed
speech pervades the instructional spectrum, how will it
complement or accommodate other media forms such as projectors
and viewers, particularly if manual operation is involved? Will
it be necessary to automate other media items to preclude overloading
the individual with competing psychomotor tasks which,
in some ways at least, must share the brain's capacity?

This  study  examined  the  effects  of  concurrent  psycho- 
motor activity  on  the  ability  to  listen  to  and  comprehend  com- 
pressed speech. 

Two pilot studies (N = 40 and N = 32) indicated that simple,
discrete psychomotor activity such as dotting and copying could
be performed at 225, 250, and 275 words per minute, provided
adequate pause time was allowed, without degradation of listening
comprehension (p<.05).

Purpose  of  the  Study 

The problem of the study was: Is the performance of
irrelevant psychomotor activity concurrent with listening to compressed
speech associated with a reduction in listening comprehension?
The purpose of the study was to examine the effects,
if any, of performing varying amounts of irrelevant concurrent
psychomotor activity while listening to compressed speech at
various information rates to determine if there would be a difference
in listening comprehension between subjects who listen only
to compressed speech and those who also perform psychomotor
activity.

Theoretical  Framework 

The study was based primarily on the single-channel
(filter theory) position of Broadbent (1958, p. 42) and the opposing
multiple-channel processing theory of Garner (1962, p. 116).

According to the single-channel position (Broadbent,
1958, p. 42) information is processed from only one input at a
time. Attention is alternated or switched between inputs very
rapidly by a selective filter or analyzer at the entrance to the
central processing mechanism. Conscious attention can be given
to only one input channel at any one time (Norman, 1968).

The multiple-channel theorists (Garner, 1962, p. 116;
Deutsch and Deutsch, 1963) take the position that information
may be processed from several inputs at the same time. All
messages are assumed to reach the central processing mechanisms.
Division and alternation of attention are possible, providing
adequate time is available for the switching (Treisman, 1969).
Attention is paid to the most important inputs during optimal loads
but could be expected to concentrate on one input under heavy
load conditions (Deutsch and Deutsch, 1963).

The time required for alternation of attention (from one source
to another and back again) was estimated to be approximately
one-third of a second, or about 334 to 336 milliseconds (Cherry
and Taylor, 1954; Reid and Travers, 1968). However, Moray's (1970)
caution that all such estimates must be considered tentative was
observed.

The literature suggested that memory trace and short-term
memory (Miller, 1956; Broadbent, 1958, p. 216; Travers,
1970, p. 144) are essential to the acquisition of information
in any processing mode. Inputs, single or multiple, which exceed
optimal loads will produce dysfunctional processing of information
and a consequent reduction of accurately decoded information
to control behavioral response. In any case, the literature
indicated that at higher than optimal information loads,
insufficient time would be available for accurate processing of
information.

Several studies (Pollack, 1952; Garner, 1953; Pollack
and Fricks, 1954; Miller, 1956; Hsia, 1971) suggested that the
maximum range of information processing ability associated with
making absolute judgments was two bits/sec. to seven bits/sec.,
depending on dimensionality of stimuli used. The requirement
to perform irrelevant psychomotor activity in this range would
be expected to cause interference with other activity such as listening
to compressed speech. Shannon's (1951) estimate that
there is approximately one (1) bit/letter, or 5.5 bits/word
(average of 4.5 letters per word plus one space) transmitted by
the English language was accepted.

Overmann  (1969)  found  that  speech  compression  reduced 
the  redundancy  cues  of  individual  words  and  the  amount  of  pro- 
cessing time  as  word  rate  increased.   This  suggested  that  loss 
of  processing  time,  as  word  rate  increases,  is  the  fundamental 
cause  for  the  reaching  of  channel  capacity.   Overmann  found  that 
restoration  of  processing  time  improved  comprehension. 

The findings of Foulke and Sticht (1967) and Foulke,
Amster, Nolan, and Bixler (1962), that comprehension declined
more rapidly than intelligibility and that comprehension showed a
slight decline up to 275 WPM but declined rapidly thereafter,
were accepted as a basis for hypothesizing.




The above findings strongly suggested that the intrusion
of other activity, such as looking at visuals, manipulating
equipment, and performing response activity, would further impinge
on processing time. They also suggested that other media forms
might inject conflicting information, producing ambiguities,
thus making the information processing task more difficult.

Peters  (1972)  reported  that  note-taking  had  a  harmful 
effect  on  comprehension.   Ludrick  (1974)  reported  that  forced- 
pacing  was  as  effective  as  self-pacing  using  visuals  on  video 
tape  with  compressed  speech  and  that  subjects  preferred  the  tele- 
vision mode  because  it  left  them  free  to  concentrate  on  learning 
rather  than  equipment  operation. 

Speech rates are expressed in words per minute (WPM).
Speech compression rates are expressed as a percentage that
indicates the portion of the original time saved.
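
As a rough illustration of the two conventions just described, the
sketch below converts between word rate and percent compression. The
175 WPM original rate is a hypothetical value chosen only for the
example; the uncompressed rate of the recordings is not reported here.

```python
def percent_compression(original_wpm: float, compressed_wpm: float) -> float:
    """Percent of the original playing time saved by compression."""
    # Playing time is inversely proportional to word rate, so the
    # fraction of time retained is original_wpm / compressed_wpm.
    return 100.0 * (1.0 - original_wpm / compressed_wpm)


def compressed_rate(original_wpm: float, percent: float) -> float:
    """Word rate that results from saving `percent` of the original time."""
    return original_wpm / (1.0 - percent / 100.0)


# Hypothetical original recording rate (an assumption, not from the study).
ORIGINAL_WPM = 175.0

for target in (250.0, 275.0):
    saved = percent_compression(ORIGINAL_WPM, target)
    print(f"{ORIGINAL_WPM:.0f} -> {target:.0f} WPM: {saved:.1f}% of time saved")
```

Under that assumed original rate, playback at 250 WPM would correspond
to roughly 30 percent compression; a different original rate would
give a different percentage for the same word rate.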

The  theoretical  base  provided  a  general  consensus  that 
the  brain  would  have  difficulty  processing  more  than  one  channel 
of  communication  at  any  one  time,  particularly  at  high  informa- 
tion rates. 


METHOD 

Subjects 

The sample population consisted of 120 volunteer subjects,
98 females and 22 males, aged 19 to 26 years. All were
enrolled in the College of Education at the University of Oklahoma
in the Fall of 1974. None had previous experience with
compressed speech.

Test  Instruments 

The  Listening  Test  of  the  Sequential  Tests  of  Educational 
Progress  (STEP)  was  used  to  provide  pre-recorded  listening  mater- 
ial.  Part  One  of  the  Listening  Test  provided  approximately  10 
minutes  practice  in  listening  to  compressed  speech  and  in  exper- 
imental procedure.   Part  Two  of  the  Listening  Test,  consisting 
of  five  selections,  was  administered  to  measure  listening  compre- 
hension at  250  and  275  WPM  over  four  psychomotor  activity  levels. 
Listening  material  was  compressed  using  a  Model  CC-103,  Variable 
Speech  Control  Copycorder,  manufactured  by  the  Magnetic  Video 
Corporation. 

The  psychomotor  activity  was  constructed  to  represent 
attendant  activity  such  as  looking  at  visuals,  switching  percep- 
tual channels  and  making  responses.  This  type  of  activity  would 
be  required  in  a  self-study  situation  to  manipulate  audio  visual 
equipment  or  other  media  items  provided  the  learning  environment 
was  not  automated. 

The psychomotor activity used was based upon information
theory and was quantified at three levels: 1 bit, 2 bits, and
3 bits. Stimulus code groups were presented in the visual
channel by the use of 35 mm slides arranged in random order
(see Appendix A for stimulus code group format). Subjects were
required to respond with a response code group (see Appendix A
for response code format). Fifty psychomotor sequences were
interspersed throughout the listening material.

Pause time of four seconds was provided to perform the
psychomotor activity. Although superimposing psychomotor activity
on a listening activity would have a higher probability of
producing interaction effects resulting in a loss of comprehension,
results would be confusing and less precise. If, as the
literature suggested, time between speech elements was
critical to processing information already received, it was
reasoned that an effect produced by performing psychomotor
activity during pause time would be more precise and powerful
and would provide a more empirical basis for generalizing.

All  experimental  listening  material  and  instructions 
were  presented  by  pre-recorded  audio  tape.   The  stimulus  code 
groups  were  projected  on  a  white  matte  screen  (See  Appendix  B 
for  slide  format).   Subjects  were  instructed  to  perform  the 
psychomotor  activity  sequence  by  1000  hertz  tones  interspersed 
throughout  the  listening  material.   Psychomotor  response  codes 
were  written  in  a  block  on  a  response  sheet. 

Subjects  answered  36  multiple-choice  questions  over  the 
listening  material  by  circling  the  letter  of  the  correct  response 
in  a  test  booklet. 

Apparatus 

A Wollensak, Model 2551, cassette recorder provided the
audio and controlled a Kodak Ektagraphic, Model AF 2 (f/2.8
lens) slide projector. The audio was distributed by shielded
audio cable through jackboxes to 600 ohm headsets. Each subject
had an individual volume control.

Experimental  Design 

The  subjects  were  assigned  to  eight  experimental  groups: 
Four  listened  at  250  WPM,  and  four  listened  at  275  WPM.   The 
four  groups  within  each  listening  rate  performed  at  separate 
levels  of  psychomotor  activity:   0  level  (without),  1  bit,  2  bit, 
and  3  bit. 

A  2  x  4  Factorial  design  paradigm  was  used: 




                       Listening Rate (B)

PMA (A)              250 WPM (B1)      275 WPM (B2)

A1   Without         I     A1B1        II    A1B2
A2   1 bit           III   A2B1        IV    A2B2
A3   2 bits          V     A3B1        VI    A3B2
A4   3 bits          VII   A4B1        VIII  A4B2
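
The layout of the eight groups can be sketched in a few lines. This is
only an illustration of how the group numerals map onto the cells of
the 2 x 4 design; the variable names are invented for the example.

```python
from itertools import product

ROMAN = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII"]
PMA_LEVELS = ["Without", "1 bit", "2 bits", "3 bits"]   # factor A
LISTENING_RATES = [250, 275]                            # factor B, in WPM

# Groups are numbered across each row of the design: I and II share
# PMA level A1, III and IV share A2, and so on.
design = {}
for label, ((a, pma), (b, wpm)) in zip(
    ROMAN, product(enumerate(PMA_LEVELS, 1), enumerate(LISTENING_RATES, 1))
):
    design[label] = {"cell": f"A{a}B{b}", "pma": pma, "wpm": wpm}

for label, info in design.items():
    print(f"Group {label:<5}{info['cell']}  {info['pma']:<8}{info['wpm']} WPM")
```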

RESULTS 

The  data  was  analyzed  using  the  raw  scores  obtained  from 
the  36  item  test  of  comprehension  and  the  50  psychomotor  sequences 
required.   The  data  is  summarized  in  Tables  1  and  2. 

Table 1
Summary of Means for Psychomotor Activity

                    Listening Rate

PMA       250 WPM            275 WPM            Mean

0         I     -            II    -
1         III   49.933       IV    50.00        49.966
2         V     50.00        VI    49.666       49.833
3         VII   49.533       VIII  49.733       49.633

Mean            49.822             49.799




Table 2
Summary of Means for Listening Comprehension

                    Listening Rate

PMA       250 WPM            275 WPM            Mean

0         I     26.133       II    26.933       26.533
1         III   25.066       IV    26.933       25.999
2         V     28.533       VI    26.733       27.633
3         VII   26.800       VIII  24.733       25.766

Mean            26.633             26.33

Comprehension

Performance  on  listening  comprehension  was  generally 
close  except  for  Groups  V  and  VIII.   Figure  1  shows  an  interac- 
tion plot  of  the  mean  comprehension  scores  over  the  two  listen- 
ing rates  and  the  four  psychomotor  activity  levels.   The  lower 
performance  by  Group  VIII  was  consistent  with  the  expected 
interaction  of  the  higher  listening  rate  and  the  highest  psycho- 
motor activity  level.   Comprehension  scores  for  the  groups 
listening  at  250  WPM  did  not  reveal  any  pattern. 


[Plot not reproduced; separate curves for 250 WPM and 275 WPM.]

Figure 1. Interaction plot of comprehension means
by listening rate over psychomotor
activity levels.


Listening  comprehension  data  for  the  groups  listening  at  275  WPM 
evidenced  a  pattern.   Group  means  were  identical  at  the  0  and 
1  bit  psychomotor  activity  levels  but  decreased  at  the  2  and  3 
bit  levels.   There  was  no  strong  evidence  of  a  pattern  for  compre- 
hension mean  scores  between  the  listening  rates. 
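
The row and column means in Table 2 can be checked from the eight cell
means. The sketch below assumes equal group sizes (120 subjects in
eight groups would give 15 per cell, which is an inference rather than
a stated fact), so that each marginal mean is a simple average of the
cell means.

```python
# Cell means from Table 2, keyed by (PMA level in bits, listening rate in WPM).
cell_means = {
    (0, 250): 26.133, (0, 275): 26.933,
    (1, 250): 25.066, (1, 275): 26.933,
    (2, 250): 28.533, (2, 275): 26.733,
    (3, 250): 26.800, (3, 275): 24.733,
}


def marginal_mean(cells, *, pma=None, wpm=None):
    """Average the cell means that match the given PMA level or rate."""
    values = [m for (p, r), m in cells.items()
              if (pma is None or p == pma) and (wpm is None or r == wpm)]
    return sum(values) / len(values)


for level in range(4):
    print(f"PMA {level}: {marginal_mean(cell_means, pma=level):.3f}")
for rate in (250, 275):
    print(f"{rate} WPM: {marginal_mean(cell_means, wpm=rate):.3f}")
```

The averages agree with the printed marginal means to within the last
decimal place; the printed values appear to be truncated rather than
rounded.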




Psychomotor  Activity 

Of  the  90  subjects  performing  psychomotor  activity,  only 
10  failed  to  perform  all  of  the  required  50  sequences.   A  total 
of  only  16  sequences  were  performed  incorrectly:   one  at  the 
1  bit  level,  5  at  the  2  bit  level,  and  10  at  the  3  bit  level. 
Figure  2  shows  an  interaction  plot  of  the  number  of  sequences 
performed  incorrectly  over  the  psychomotor  activity  levels.   The 
plot  indicates  an  approximate  logarithmic  progression  in  the 
number  of  sequences  incorrectly  performed  over  the  psychomotor 
activity  levels. 


[Plot not reproduced; vertical axis: Number (0 to 10);
horizontal axis: PMA Level (bits).]

Figure 2. Interaction plot of number of psychomotor
activity sequences performed incorrectly
over psychomotor activity levels.


Analysis  of  Data 

A  two-way  analysis  of  variance  for  listening  comprehen- 
sion versus  psychomotor  activity  was  computed.   A  summary  is 
provided  in  Table  3. 


Table 3

Two-way Analysis of Variance for Listening
Comprehension Versus Psychomotor Activity


Source of Variation      SS         df     MS        F         p

Between Rates            2.695      1      2.695     0.1030    0.7476
Between Modes            62.169     3      20.723    0.7920    0.5037
Interaction (R X M)      84.492     3      28.164    1.0764    0.3626
Error Variance           2930.59    112    26.166

Total                    3079.95    119

The  analysis  indicates  that  there  was  no  difference  in 
listening  comprehension  between  subjects  who  listened  only  and 
those  who  performed  psychomotor  activity  (P<.10). 

A  two-way  analysis  of  variance  for  psychomotor  activity 
performed  versus  listening  rate  was  computed.   A  summary  is 
provided  in  Table  4. 




Table 4

Two-way Analysis of Variance for Psychomotor
Activity Performed Versus
Listening Rate


Source of Variation      SS        df    MS       F

Between Rates            0.352     1     0.352    0.8966
Between Modes            1.992     2     0.996    2.5403
Interaction (R X M)      0.762     2     0.381    0.9713
Error Variance           32.928    84    0.392

Total                    36.034    89

The analysis indicates that there was no association
between the amount of psychomotor activity performed and the
listening rate (P<.10). The F ratio of 2.54 obtained between
modes was significant (P<.05) but, because of the small amount
of variation in the row means, was considered unimportant.
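
The mean squares and F ratios in Tables 3 and 4 follow from the sums
of squares and degrees of freedom: each mean square is SS/df, and each
F is the effect mean square divided by the error mean square. The
sketch below recomputes them from the printed values; the function and
variable names are invented for the illustration.

```python
def f_ratios(effects, error_ss, error_df):
    """Return {source: (mean square, F)} for each effect.

    `effects` maps a source name to its (sum of squares, df).
    """
    error_ms = error_ss / error_df
    return {name: (ss / df, (ss / df) / error_ms)
            for name, (ss, df) in effects.items()}


# Sums of squares and degrees of freedom as printed in Table 3.
table3 = f_ratios(
    {"rates": (2.695, 1), "modes": (62.169, 3), "interaction": (84.492, 3)},
    error_ss=2930.59, error_df=112,
)
# Sums of squares and degrees of freedom as printed in Table 4.
table4 = f_ratios(
    {"rates": (0.352, 1), "modes": (1.992, 2), "interaction": (0.762, 2)},
    error_ss=32.928, error_df=84,
)

for title, table in (("Table 3", table3), ("Table 4", table4)):
    for source, (ms, f) in table.items():
        print(f"{title} {source:<12} MS = {ms:7.3f}  F = {f:.4f}")
```

The recomputed ratios match Table 3 closely and differ from Table 4
only in the third decimal place, which is presumably the effect of
rounding in the printed sums of squares.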

The  analyses  indicated  that  the  several  levels  of  psycho- 
motor activity  performed  did  not  produce  significant  degradation 
to  listening  comprehension  at  250  and  275  WPM. 

To investigate the equivalency of the experimental groups,
a one-way analysis of variance was computed on grade point averages
of the subjects for all groups. A significant F ratio was
not obtained (P<.05). It was, therefore, concluded that the
groups were essentially equal in respect to GPA.

Information  Transmission  Rates 

The five test selections (6-10) averaged 4.508 letters
per word. Using Shannon's (1951) estimate of 5.5 bits/word,
based on an average of 4.5 letters per word, the information
transmission rate averaged approximately 23.1 bits/sec. at 250
WPM and 25.3 bits/sec. at 275 WPM. The transmission rate of
25.3 bits/sec. is very close to the maximum rate for reception
of speech suggested by Quastler and Wulff (1955).
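
The arithmetic behind such rates can be sketched as bits per word
times words per second. This is a nominal calculation under the stated
one-bit-per-letter estimate, not a reconstruction of how the figures
above were obtained; the intermediate steps are not given in the text.

```python
BITS_PER_LETTER = 1.0      # Shannon's (1951) estimate, as cited in the paper


def bits_per_second(wpm: float, letters_per_word: float) -> float:
    """Nominal information rate of speech at `wpm` words per minute."""
    # Each word is counted as its letters plus one trailing space.
    bits_per_word = BITS_PER_LETTER * (letters_per_word + 1.0)
    return bits_per_word * wpm / 60.0


for wpm in (250, 275):
    nominal = bits_per_second(wpm, letters_per_word=4.5)
    measured = bits_per_second(wpm, letters_per_word=4.508)
    print(f"{wpm} WPM: {nominal:.2f} bits/sec (4.5 letters), "
          f"{measured:.2f} bits/sec (4.508 letters)")
```

The nominal values come out slightly below the 23.1 and 25.3 bits/sec.
reported, so the reported figures may rest on measured rather than
nominal word rates; that is a guess, since the derivation is not shown.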


DISCUSSION 

The  provision  of  pause  time  to  perform  the  psychomotor 
activity  reduced  the  probability  of  interaction  taking  place. 
Despite  this  fact,  the  comprehension  mean  scores  for  the  group 
which  listened  at  275  WPM  and  performed  the  highest  level  of 
psychomotor activity (Group VIII) evidenced a marked reduction.
This strongly suggested that channel capacity had been reached
and  that  a  sharp  decline  in  listening  comprehension  had  begun. 
This  further  suggested  that  the  performance  of  the  3  bit  level 
of  psychomotor  activity  was  responsible  because  there  was  no 
evidence  at  other  levels  of  such  a  marked  reduction. 




Although  the  analysis  did  not  indicate  that  the  infor- 
mation loads  imposed  by  the  dual  task  activity  were  unaccept- 
able, this  result  must  be  interpreted  in  light  of  the  fact  that 
pause time was provided. The analysis, particularly of the performance
by Group VIII, suggests that without pause time the
dual task activity information loads would more probably have become
unacceptable. The lowering of
comprehension  as  indicated  by  the  mean  score  of  24.733,  which 
is  the  lowest  for  all  of  the  groups,  indicates  that  Group  VIII 
was  operating  at  or  near  channel  capacity  in  respect  to  listen- 
ing comprehension.   The  data  suggested  that  performance  of  the 
psychomotor  activity  was  beginning  to  interfere  with  processing 
of  the  speech  input.   Without  the  provision  of  pause  time  to  per- 
form the  psychomotor  activity  a  further  reduction  would  have  been 
very  probable.   It  is  also  suggested  that  even  with  pause  time, 
comprehension  would  decline  further  with  higher  psychomotor 
activity  levels  or  listening  rates.   Competition  for  short-term 
memory  and  channel  switching  would  also  have  become  more  critical. 
The  results  tend  to  agree  with  findings  by  Overmann  (1969)  that 
processing  demands  increase  with  word  rates. 

The data suggested that except for Group VIII (275 WPM
and three bit psychomotor activity level) the groups were operating
at or below channel capacity in respect to comprehension.
The data suggested that Group VIII had begun to exceed channel
capacity because of the combined processing demands of listening
at 275 WPM and performing three bits of psychomotor activity.

The  general  logarithmic  increase  in  the  number  of  psycho- 
motor activity  sequences  performed  incorrectly  appeared  to  be  a 
result  of  the  information  processing  demands  of  the  individual 
psychomotor  activity  levels  because  of  the  lack  of  interaction 
with  the  listening  rates  as  indicated  by  the  analysis  of  variance. 

CONCLUSIONS 
The  following  conclusions  are  drawn  from  the  analysis: 

1. The performance of complex psychomotor activity, up
to a level of three bits per sequence, will not significantly
affect listening comprehension at 250 and 275 words per minute
provided adequate pause time is allowed.

2. The ability to correctly perform up to three bits of
psychomotor activity will not be significantly affected by listening
to compressed speech at rates of 250 and 275 words per
minute provided adequate pause time is allowed.

3. The performance of psychomotor activity sequences,
up to a level of three bits per sequence, between speech elements,
does not significantly interfere with information processing
related to speech reception.

The above conclusions are further predicated on the
assumption that the number of psychomotor sequences or tasks
would be kept to a reasonable number. In this study psychomotor
sequences were performed every 9.2 seconds at 250 WPM and
every 8.8 seconds at 275 WPM.

The  following  implications  are  drawn  from  the  analysis. 
Generalization  beyond  the  type  of  population  sampled  in  the 
study  is  not  implied: 

1.  Students  will  be  able  to  listen  to  compressed  speech 
at  250  and  275  words  per  minute  and  also  perform  simple,  dis- 
crete, concurrent  psychomotor  activity,  such  as  manipulation  of 
viewers,  recorders,  and  projectors  without  unacceptable  degrad- 
ation to  comprehension  provided  adequate  pause  time  is  provided. 

2.  Students  will  be  able  to  listen  to  compressed  speech 
and  also  perform  complex,  concurrent  psychomotor  activity,  such 
as  switching  perceptual  channels,  observing  visuals,  copying 
data  and  making  overt  responses  without  degradation  to  compre- 
hension provided  adequate  pause  time  is  provided.   This  state- 
ment is  made  only  in  respect  to  psychomotor  activity  levels  up 
to  and  including  three  bits  of  activity. 

3.  If  students  are  expected  to  accomplish  complex  psycho- 
motor activity  rapidly,  in  conjunction  with  compressed  speech 
listening,  with  only  minimal  pause  time  provided,  some  deterior- 
ation in  performance  of  the  psychomotor  activity  should  be 
expected, particularly at higher listening rates. This statement
is made in respect to psychomotor activity levels up to
and including three bits of activity.

RECOMMENDATIONS  FOR  FURTHER  STUDY 

Because  previous  studies  had  not  been  conducted  in  this 
specific  area,  this  study  of  necessity  proceeded  in  an  evolu- 
tionary mode  based  on  the  theoretical  framework.   The  perform- 
ance of  subjects,  in  each  study,  exceeded  the  expectations  sug- 
gested by  the  theoretical  framework,  particularly  in  respect  to 
accomplishment  of  psychomotor  activity.   The  theoretical  frame- 
work and  this  study  suggest  the  possibility  for  numerous  addi- 
tional studies,  particularly  studies  examining  higher  listening 
rates  and  psychomotor  activity  levels. 




BIBLIOGRAPHY 


Broadbent, D. E. Perception and communication. New York:
Pergamon Press, 1958.

Cherry, E. Colin, & Taylor, W. K. Some Further Experiments
Upon the Recognition of Speech with One and with Two Ears.
The Journal of the Acoustical Society of America, July
1954, 26, 554-559.

Deutsch, J. A., & Deutsch, D. Attention: Some Theoretical
Considerations. Psychological Review, 1963, 70, 80-90.

Foulke, Emerson, Amster, C. H., Nolan, C. Y., & Bixler, R. H.
The Comprehension of Rapid Speech by the Blind. Exceptional
Children, 1962, 29, 134-141.

Foulke, Emerson, & Sticht, Thomas G. The Intelligibility and
Comprehension of Time Compressed Speech. Proceedings of
the Louisville Conference on Time Compressed Speech.
Louisville, Kentucky: Center For Rate Controlled Recordings,
University of Louisville, 1967.

Garner,  W.  R.   An  Information  Analysis  of  Absolute  Judgment  of 
Loudness.   Journal  of  Experimental  Psychology,  1953,  46, 
373-380. 

Garner, W. R. Uncertainty and structure as psychological concepts.
New York: John Wiley, 1962.

Hsia, H. J. The Information Processing Capacity of Modality
and Channel Performance. AV Communications Review, Spring
1971, 19(1), 51-75.

Ludrick,  John  A.   A  Study  of  the  Effects  of  Controlled  Delivery 
Instructions  Upon  the  Achievement  of  College  Students  Us- 
ing Compressed  Speech  Audio  and  Television  Pictorials. 
Unpublished  Doctoral  Dissertation,  University  of  Oklahoma, 
Norman,  Oklahoma,  1974. 

Miller,  George  A.   The  Magical  Number  Seven,  Plus  or  Minus  Two: 
Some  Limits  on  Our  Capacity  for  Processing  Information. 
The  Psychological  Review,  March,  1956,  63,  81-96. 

Moray,  N.   Attention:   Selective  processes  in  vision  and  hear- 
ing.  New  York:   Academic  Press,  1970. 




Norman,  D.  A.   Toward  a  Theory  of  Memory  and  Attention. 
Psychological  Review,  1968,  75,  522-536. 

Overmann,  R.  A.   Processing  Time  as  a  Variable  in  the  Compre- 
hension of  Time-Compressed  Speech.   Proceedings  of  the 
Second  Louisville  Conference  on  Rate  and/or  Frequency- 
Controlled  Speech.   Louisville,  Kentucky:   Center  For  Rate 
Controlled  Recordings,  University  of  Louisville,  1969. 

Peters,  D.  L.   Effects  of  Note  Taking  and  Rate  of  Presentation 
on  Short-Term  Objective  Test  Performance.   Journal  of 
Educational  Psychology,  1972,  63,  276-280. 

Pollack, I. The Information of Elementary Audio Displays.
Journal of Acoustical Society of America, 1952, 24, 745-750.

Pollack, I., & Fricks, L. Information of Elementary Multidimensional
Auditory Displays. Journal of Acoustical
Society of America, 1954, 26, 155-158.

Quastler, H., & Wulff, V. J. Human Performance in Information
Transmission. The University of Illinois Report No. R62.
Urbana, Illinois: Control System Laboratory, 1955.

Reid,  I.,  &  Travers,  R.  M.  W.   Time  Required  to  Switch  Atten- 
tion.  American  Educational  Research  Journal,  1968,  5, 
203-211. 

Shannon,  C.  E.   Prediction  and  Entropy  of  Printed  English. 
Bell  System  Technical  Journal,  1951,  30,  50-64. 

Treisman,  A.  M.  Strategies  and  Models  of  Selective  Attention. 
Psychological  Review,  1969,  76(3),  282-299. 

Travers, Robert M. W. Man's information system. Scranton, Pa.:
Chandler Pub. Co., 1970.




APPENDIX  A 

Psychomotor  Activity 
Stimulus/Response  Code  Groups 


Level     Stimulus Group    Response Group    Memory Aid

1 bit     0                 SUB               Subtract
          1                 ADD               Add

2 bits    00                SUB               Subtract
          01                DIV               Divide
          11                ADD               Add
          10                MUL               Multiply

3 bits    000               SUB               Subtract
          001               DIV               Divide
          011               WRI               Write
          111               ADD               Add
          110               LOO               Loop
          100               MUL               Multiply
          101               CLO               Close
          010               OPN               Open
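
The code groups above can be read as a lookup table, and the bit level
of each set follows from the number of alternatives it contains. The
sketch below is an illustration only: it assumes that every stimulus
code within a level is equally likely, so that a set of n codes carries
log2(n) bits per stimulus, and the function names are invented for the
example.

```python
from math import log2

# Stimulus code -> response code, transcribed from the table above.
CODE_GROUPS = {
    1: {"0": "SUB", "1": "ADD"},
    2: {"00": "SUB", "01": "DIV", "11": "ADD", "10": "MUL"},
    3: {"000": "SUB", "001": "DIV", "011": "WRI", "111": "ADD",
        "110": "LOO", "100": "MUL", "101": "CLO", "010": "OPN"},
}


def information_bits(level: int) -> float:
    """Bits conveyed by one stimulus if all codes are equally likely."""
    return log2(len(CODE_GROUPS[level]))


def respond(stimulus: str) -> str:
    """Look up the response code for a stimulus; its length gives the level."""
    return CODE_GROUPS[len(stimulus)][stimulus]


for level in CODE_GROUPS:
    print(f"{level}-bit level: {len(CODE_GROUPS[level])} alternatives, "
          f"{information_bits(level):.0f} bits per stimulus")
print(respond("110"))
```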



APPENDIX  B 


SLIDE  FORMAT 


1  Bit  Level 


2  Bit  Level 


00 

01 

3  Bit  Level 


000 

001 



Normal Hearing Children's Intelligibility of Time-Compressed
Words

by Shoup, J., Beasley, D. S., Maki, J. E., & Bess, F.


ABSTRACT 

Recent research findings with adults have suggested that time-
compressed speech may be useful in detecting and treating auditory
perceptual dysfunctions. The purpose of the present investigation
was to extend this position to potential use with
children. Three time-compressed versions of an auditory discrimination
measure were presented to 90 normal-hearing
children ranging in age from four years - six months, to eight
years - six months. The stimuli were presented monaurally via
earphones, at sensation levels of 16 and 32 dB to an equal number
of  right  and  left  ears.  Results  showed  that  intelligibility 
decreased  as  a  function  of  increasing  time  compression  and 
decreasing  age  and  sensation  level.  Ear  and  list  differences 
were  negligible.  The  mean  scores  were  comparable  to  an 
earlier  study  of  time-compressed  speech  which  used  children 
as  listeners. 




TITLE:  NORMAL  HEARING  CHILDREN'S  INTELLIGIBILITY  OF  TIME 
COMPRESSED  WORDS 


AUTHORS:  Jane  Shoup,  M.A. 

Speech,  Language  and  Hearing  Clinician 
Lebanon  Community  School  Corporation 
Lebanon,  Indiana  46052 


Daniel  S.  Beasley,  Ph.D. 

Associate  Professor  and  Acting  Assistant  Dean 

Audiology  and  Speech  Sciences 

College  of  Communication  Arts  and  Sciences 

Michigan  State  University 

East  Lansing,  Michigan  48824 


Jean  E.  Maki,  Ph.D. 

Speech  Scientist 

National  Technical  Institute  for  the  Deaf 

Rochester,  New  York  14623 


Fred  Bess,  Ph.D. 
Associate  Professor 
Director  of  the  Hearing  Clinic 
Central  Michigan  University 
Mt.  Pleasant,  Michigan  48858 




The effect of time-altered speech on the auditory processing
performance of normal hearing as well as hearing impaired and brain
damaged adult populations has received considerable attention in
recent years (Beasley and Maki, 1976). The results of these studies
have shown that time compressed speech stimuli can be an effective
diagnostic tool for evaluating the auditory perceptual abilities of
these populations. Little research, however, is available to
indicate how populations of children perform on standard speech
discrimination measures.

Beasley, Maki and Orchik (1975) investigated the effect of time
compression on normal hearing children's perception of speech using
the Word Intelligibility by Picture Identification (WIPI) speech
discrimination measure (Ross and Lerman, 1970) presented in a
sound-field listening situation. The WIPI consists of twenty-five
six-picture plates, with four of the pictures on each plate used as
test stimuli. The child is required to give a picture-pointing
response when a word is presented auditorily. Results of the study
by Beasley et al. showed that discrimination of time compressed
speech signals improved as age and sensation level increased and
declined as the percentage of time compression increased. The design
of the present investigation was similar to that of Beasley et al.'s
earlier study, except that subjects received the test stimuli under
earphones rather than via sound field.




Methods  and  Procedures 

The  subjects  of  the  present  study  were  ninety  normal  hearing 
children  divided  into  three  age  groups.  Group  I  consisted  of 
subjects  aged  3  years-6  months  to  4  years-6  months,  whereas  Group  II 
subjects  were  5  years-6  months  to  6  years-6  months,  and  Group  III 
subjects  were  7  years-6  months  to  8  years-6  months.  In  order  to 
qualify  as  a  subject  for  the  study,  each  child  was  required  to  pass 
a bilateral pure-tone air-conduction hearing screening test at 20 dB
(re: ISO 1964) at octave intervals from 125 through 8000 Hz. A
speech  reception  threshold  (SRT)  for  each  child  was  determined  for 
the  test  ear  as  a  reference  level  for  test  stimuli  presentations. 

The four lists of the WIPI speech discrimination measure were
recorded and used as the test stimuli for the experimental tapes.
The speaker for the WIPI was a male General American speaker (DB)
who exhibited a normal fundamental frequency and speaking rate.
Each of the four WIPI lists was prepared under three time
compression conditions (0%, 30% and 60%) using the Lexicon
Varispeech I time compressor (Lee, 1972). The stimuli were
presented monaurally to each subject individually via earphones in
a two-room sound treated audiometric testing suite, using
high-quality recording and listening apparatus which was routinely
checked for calibration.
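
Lee (1972) refers to the sampling method of time compression, in which short segments of the signal are periodically discarded and the retained segments are abutted. The sketch below illustrates only that general idea on a list of samples. It is not the Varispeech I circuit, and the function name, the frame_len parameter, and its default value are assumptions made for illustration.

```python
def compress_by_sampling(samples, percent, frame_len=400):
    """Discard the last `percent` of every frame and abut what remains.

    `frame_len` (in samples) is an arbitrary illustrative choice, not a
    value taken from the Varispeech I or from the study.
    """
    if not 0 <= percent < 100:
        raise ValueError("percent must be in the range [0, 100)")
    keep = round(frame_len * (100 - percent) / 100)
    out = []
    for start in range(0, len(samples), frame_len):
        # Retain the leading part of the frame, drop the remainder.
        out.extend(samples[start:start + keep])
    return out
```

Under these assumptions, 30% compression leaves about 70% of the original duration and 60% compression leaves about 40%, while 0% compression returns the signal unchanged.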

Each subject received a total of four lists of the WIPI: at each
sensation level (16 dB and 32 dB re: SRT), one list at 0% time
compression and one list at either 30% or 60% time compression.
The 0% time compressed condition was always presented first to the
subject, and each subject always received the two lists under the
32 dB SL condition prior to the presentation of the two lists at
16 dB SL. Further, four practice items preceded the presentation of
each of the test lists. Order of presentation of the four test lists
was rotated so that each list of the WIPI was used only once per
subject. Subjects were assigned randomly to left and right ear
conditions, such that there was an equal number of right and left
test ears per group.
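
The presentation order described above can be sketched as a small schedule builder. This is a hedged illustration only: the function name, the dictionary keys, the rotation rule, and the alternation of ears by subject index are assumptions (the study assigned ears randomly, subject to equal numbers per group).

```python
WIPI_LISTS = ["I", "II", "III", "IV"]


def build_schedule(subject_index, compressed_pct):
    """Return the four presentations for one subject, in order.

    compressed_pct is 30 or 60, the compressed condition the subject
    hears in addition to 0%.
    """
    shift = subject_index % len(WIPI_LISTS)
    lists = WIPI_LISTS[shift:] + WIPI_LISTS[:shift]   # rotate list order
    # Simplification: alternate ears instead of random assignment.
    ear = "right" if subject_index % 2 == 0 else "left"
    # 32 dB SL before 16 dB SL; 0% before the compressed condition.
    conditions = [(32, 0), (32, compressed_pct), (16, 0), (16, compressed_pct)]
    return [
        {"ear": ear, "list": name, "sl_db": sl, "compression_pct": pct}
        for name, (sl, pct) in zip(lists, conditions)
    ]
```

Each subject hears every list exactly once, and the easiest condition (32 dB SL, 0%) always comes first.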

During  the  presentation  of  the  WIPI  tapes,  the  examiner  remained 
in  the  room  with  the  subject  and  scored  each  response.  An  assistant 
controlled  the  tape  recorder,  stopping  the  presentations  temporarily 
if  the  subject  required  a  longer  time  to  respond  between  stimuli. 

Results 

The  mean  percentage  correct  scores  for  each  of  the  three  subject 
groups  under  each  condition  of  time  compression,  age  and  sensation 
level  were  computed.  Mean  scores  obtained  by  right  and  left  ears  were 
compared  as  were  scores  for  each  of  the  four  WIPI  lists.  Generally, 
the  results  of  the  study  showed  that  scores  decreased  as  a  function  of 
increasing  time  compression  and  decreasing  sensation  level  and  age 
(See  Table  1).  No  consistent  trends  relative  to  ear  differences  were 
observed.  Finally,  scores  on  List  IV  of  the  WIPI  were  somewhat  poorer 
than  the  scores  on  the  other  three  lists,  and,  further,  List  IV 
contained  nearly  double  the  number  of  stimulus  items  comprising  the 
10  most  frequently  missed  words,  compared  to  the  other  three  lists. 

As  indicated  in  Table  1,  the  mean  percent  correct  scores  for  all 
age  levels  decreased  under  both  sensation  levels  as  time  compression 
ratio  increased.  This  effect  was  most  evident  for  60%  time  compression 
and  16  dB  SL.  Further,  Table  1  shows  that  scores  increased  under  all 
three  time  compression  conditions  as  the  sensation  level  was  increased 
from  16  to  32  dB,  a  finding  also  noted  by  Beasley,  Maki  and  Orchik  (1975). 

[Table 1. Mean percent correct scores for each age group under each
condition of time compression and sensation level. The table body is
not legible in the source scan.]



Beasley et al., however, found that there was a major improvement in
scores as a function of sensation level at 60% time compression,
whereas the results of the present study indicated that score
improvement as a function of sensation level was about the same for
all levels of time compression.

The effect of age also is shown in Table 1. As indicated, the
mean percentage correct scores increased with age at both sensation
levels and under all time compression conditions. The greatest
score differences were found between the four year old group and the
six year old group. In general, scores for the six year olds were
only slightly poorer than scores for the eight year olds. This
trend as a function of age suggests that younger children with less
language experience need the temporal redundancy of non-compressed
stimuli for maximum intelligibility.

The mean percent correct scores obtained in this study were
similar to those of Beasley et al., but were somewhat better under
the 30% and 60% time compressed conditions at 16 dB SL and at 60%
time compression at 32 dB SL. The improved scores of this study
probably resulted, at least in part, from the procedure by which the
test stimuli were presented to the subjects. In addition to the
use of earphones, procedural differences included presentation of
four taped practice items at the same rate of compression as the
associated test list, thereby allowing the subject time to adjust to
the time-compressed condition and sensation level of the stimuli.
Further, each subject received a 0% time compressed condition
prior to receiving either the 30% or 60% condition at each sensation
level, and all subjects received two 32 dB SL conditions prior to
receiving any lists at 16 dB SL. These procedures resulted in the
easiest conditions being presented first, followed by more
difficult conditions at a lower sensation level. Beasley, Maki, and
Orchik did not hold constant the sensation level or time compression
rate presented initially to each subject; thus the most difficult
sensation level and highest time compression rate were sometimes the
first condition presented.

In  addition  to  normal  hearing  children  as  subjects,  other 
populations  of  children,  such  as  the  learning  disabled,  brain  damaged 
and  hearing  impaired  should  be  included  as  subjects  of  future  studies 
to  provide  information  about  the  speech  discrimination  abilities  of 
these  populations  under  time  compression.  Such  information  would  likely 
have  diagnostic  as  well  as  therapeutic  value. 




References 


Beasley, D., and Maki, J. Time- and frequency-altered speech.
In Lass, N., Readings in Experimental Phonetics. Wiley and
Son (1976).

Beasley, D., Maki, J., and Orchik, D. Children's perception
of time-compressed speech on two measures of speech discrimination.
J. Speech Hear. Dis., In Press (1976).

Lee, F. Time compression and expansion of speech by the sampling
methods. J. Audio. Eng. Soc., 20, 738-742 (1972).

Ross, M., and Lerman, J. A Picture Identification Test for
Hearing-Impaired Children. J. Speech Hear. Res., 13, 44-53 (1970).




The  Performance  of  Children  Who  Display  Auditory  Processing 
Disorders  on  a  Time-Compressed  Speech  Discrimination  Task 
by Manning, W. H., Johnston, K. L., & Beasley, D. S.




TITLE:  THE   PERFORMANCE    OF   CHILDREN   WHO   DISPLAY   AUDITORY  PROCESSING 

DISORDERS   ON   A  TIME-COMPRESSED  SPEECH   DISCRIMINATION  TASK 


AUTHORS:   Walter  H.  Manning 

Assistant  Professor 

Division  of  Speech  Pathology  and  Audiology 

University  of  Nebraska 

Lincoln,  Nebraska  68588 

Kathleen  L.  Johnston 
Diagnostician 

Educational  Service  Unit  4 
Auburn,  Nebraska  68305 

Daniel  S.  Beasley 


Associate  Professor  and  Assistant  Chairman 
Audiology  and  Speech  Science  Department 
Michigan  State  University 
East  Lansing,  Michigan   48824 


ADDRESS  CORRESPONDENCE  TO  FIRST  AUTHOR  (MANNING) 




THE PERFORMANCE OF CHILDREN WHO DISPLAY AUDITORY

PROCESSING DISORDERS ON A TIME-COMPRESSED

SPEECH DISCRIMINATION TASK

Walter H. Manning, Ph.D.    Kathleen L. Johnston, M.A.

Daniel S. Beasley, Ph.D.

ABSTRACT 

The  speech  discrimination  of  20  elementary  school  children  displaying 
auditory  perceptual  disorders  was  investigated  using  time-compressed 
stimuli.     Children  were  evaluated  with  a  battery  of  diagnostic  speech  and 
language  assessments  and  demonstrated  subaverage  performance  by  one 
year  or  more  on  a  minimum  of  two  auditory  processing  tasks.      Profiles  for 
the  subjects  were  typical  of  auditory  modality  learning  disabled  (not  mentally 
retarded)  children. 

The stimuli consisted of three tape recorded lists of the Phonetically
Balanced Kindergarten (PBK-50) words. Each list was time-compressed by
0%, 30%, and 60%. Following a pure tone bilateral hearing screening and
the establishment of each subject's speech reception threshold, each subject
received each of the three PBK-50 lists at a different one of the three
levels of time compression. Subjects received the lists and the time
compression conditions in a counterbalanced order. Lists were presented
at 32 dB SL via soundfield conditions in a sound treated suite.

Results indicated that the children performed nearly the same at the
0% and 30% time compression conditions (84.9 and 85.4 mean percent correct,
respectively). Scores at the 60% condition were significantly poorer (57.3
mean percent correct). Comparison of these results with normative data
indicated that the children displaying auditory processing disorders performed
more poorly at both the 0% and 60% time compression conditions than did the
normal children. However, the children with auditory processing difficulties
performed essentially the same as the normal children on speech stimuli
compressed by 30% (85.4 and 86.4 mean percent correct, respectively). The
relatively better performance by the subjects in the present study on the
speech stimuli time compressed by 30% may have resulted from a neutralizing
effect for the rapid decay of stimuli often associated with the initial stages
of the short-term memory system of these children.




Time  compressed  speech  discrimination  measures  have  been  used  increasingly 
during  recent  years  as  one  means  of  investigating  auditory  perceptual  functions 
of  the  central  nervous  system.   Several  investigations  have  provided  information 
concerning  the  usefulness  of  such  temporally  altered  stimuli  with  both  normal 
and non-normal adult subjects.  Only recently, however, have data become
available concerning the performance of children on time compressed speech
discrimination measures.

For  example,  Beasley,  Maki  and  Orchik  (1975)  time  compressed  the  WIPI  and 
the  PB-K  50  speech  discrimination  measures  and  presented  them  at  two  sensation 
levels  to  60  children  divided  into  three  age  groups  of  20  each.   Results  showed 
that  average  intelligibility  scores  increased  as  a  function  of  increasing  age 
and  sensation  level  and  decreased  with  increasing  amounts  of  time  compression. 
Beasley  et  al.  suggested,  however,  that  in  order  for  temporally  distorted  stimuli 
to  be  used  validly  and  reliably  in  clinical  settings,  further  investigations 
employing  hearing-impaired  and  other  "auditorially  perceptually  impaired" 
children be initiated.  The purpose of the present investigation was to
determine the performance of children who displayed auditory perceptual
disorders on a time compressed speech discrimination measure.

METHOD 
Subjects 

Twenty  children  from  a  middle  socio-economic  environment  served  as  subjects. 
The  children  ranged  in  age  from  7  years,  6  months  to  8  years,  6  months  with  a 
mean  age  of  8  years,  1  month.   All  subjects  had  normal  hearing  bilaterally  as 
determined by a pure-tone audiometric screening test administered at 20 dB at
octave intervals from 125 to 8000 Hz.  Speech reception thresholds were obtained
using Utley's Children's Spondees (Utley, 1951), which were tape recorded by
a trained male speaker for presentation to the subjects.  The mean SRT for
the subjects was 7 dB, with a range from 2 to 16 dB.  The Denver Articulation
Screening Exam (Drumwright, 1971) was administered to each child in order to
assess possible articulation errors which might affect the subjects' responses
during testing.

Subjects selected for testing were children who had been reported by
their classroom teachers as displaying learning difficulties.  These children
had also been formally evaluated by experienced graduate students in speech
pathology and audiology with a battery of diagnostic speech and language
assessments.  The children manifested irregular profiles across abilities;
that is, performance was appropriate or better on visual tasks and below age
level on auditory tasks.  For the purpose of this study, an auditory disorder
was defined as performance at least one year below chronological age norms for
two or more tests or subtests of the Full-Range Action-Agent Vocabulary Test
(Gesell, 1940) and the Illinois Test of Psycholinguistic Abilities (Kirk,
McCarthy, and Kirk, 1968).  Subtests from the ITPA on which the subjects were
tested included: Auditory Reception, Auditory Association, Auditory Sequential
Memory, Grammatic Closure, Auditory Closure, and Sound Blending.

Experimental Procedures

The stimuli used were copies of the three PB-K lists used by Beasley,
Maki and Orchik.  The three PB-K 50 lists had been time-compressed by 30%
and 60% using a Lexicon Varispeech I (Lee, 1971) time compressor.  A control
condition of 0% time compression was also used.




The  three  time  compression  conditions  were  presented  to  each  of  the  20 
subjects  in  a  counterbalanced  order  with  List  1  always  presented  first,  List  2 
second,  and  List  3  last.   All  tapes  were  presented  at  32  dB  sensation  level 
re:  the  speech  reception  threshold  of  each  individual  subject. 
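
One way to picture this design is sketched below: the list order is fixed (List 1, 2, 3) while the order of the compression conditions varies across subjects, and the presentation level is the subject's SRT plus 32 dB. How the authors actually distributed the orders over the 20 subjects is not stated, so cycling through the six possible orders is an assumption, as are all names in the sketch.

```python
from itertools import permutations

COMPRESSION_LEVELS = (0, 30, 60)                  # percent time compression
ORDERS = list(permutations(COMPRESSION_LEVELS))   # the six possible orders
SENSATION_LEVEL_DB = 32                           # dB above each child's SRT


def session_plan(subject_index, srt_db):
    """Pair the fixed list order with one order of compression conditions."""
    order = ORDERS[subject_index % len(ORDERS)]   # assumed assignment rule
    level_db = srt_db + SENSATION_LEVEL_DB        # same reference as the SRT
    return [
        {"list": number, "compression_pct": pct, "level_db": level_db}
        for number, pct in zip((1, 2, 3), order)
    ]
```

For a child with the group's mean SRT of 7 dB, every list would be presented 32 dB above that threshold, at 39 dB on the same scale.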

All  testing  took  place  in  a  sound  treated  suite  with  a  high  quality 
talk-back  system.   The  stimuli  were  presented  via  sound-field  from  a  high 
quality  tape  recorder  to  a  speech  audiometer  coupled  to  a  loudspeaker  located 
in  the  listening  room.   Calibration  tones  of  individual  lists  were  adjusted 
so  that  a  reading  of  0  dB  VU  was  noted  on  the  speech  audiometer.   The  sound 
field  was  calibrated  to  13  dB  SPL  for  speech  spectrum  noise  in  the  manner 
described  by  Tillman,  Johnson,  and  Olsen  (1966)  and  was  checked  prior  to 
testing  each  subject.   During  the  testing  sessions,  the  listener  was  seated 
three  feet  from  and  directly  in  front  of  the  loudspeaker.   The  standard  set 
of  instructions  associated  with  the  PB-K  50  measure  was  administered.   Visual 
as  well  as  auditory  cues  were  used  by  the  examiner  in  determining  subject 
responses. 

Subject fatigue and lack of attention during the administration of the
three 50-word lists were anticipated as possible problems with this population
of children.  In order to deal with this possibility, only the first half
(25 words) of each of the three lists was administered.  A previous analysis
of the scores obtained by Beasley et al. (Manning, Shaw, Maki and Beasley,
1975) had indicated that half-list and whole-list scores were equivalent.




RESULTS

Tabulation of mean percent correct scores across the three levels of
time compression and the three PB-K lists indicated that the children displaying
auditory perceptual disorders in the present investigation demonstrated a
nonlinear decrease in performance as the amount of time compression increased.


Table 1.   Mean percent correct scores, standard deviations, and ranges
           for each condition of time compression (0%, 30% and 60%) for
           the PB-K 50 stimuli presented at 32 dB SL.


                                  TIME COMPRESSION
                         0%          30%         60%         Total

Mean Score               84.9        85.4        57.3        75.9
Standard Deviation        6.9         6.9        14.7        16.6
Range                    68 - 69     68 - 96     24 - 80     24 - 96




That is, these children performed nearly the same on the 0% and 30% time
compression conditions (84.9 and 85.4 percent correct, respectively).  This
ability to perform equally well on both of these conditions was also
demonstrated by the fact that 14 of the 20 children tested achieved the same or
better scores at the 30% time compression condition as they achieved at the
0% time compression condition.  The mean percent correct score for subjects
on the 60% time compression condition (57.3 percent) indicated a pronounced
decrease in performance from 30 to 60 percent time compression.  Results of
a t test for related samples, comparing individual subject scores on the 30%
and 60% time compression conditions, indicated that the children performed
significantly better (p < .05) at the 30% time compression condition.
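
The t test for related samples works on the per-subject differences between the two conditions: the mean difference is divided by its standard error, with n - 1 degrees of freedom. The sketch below shows that computation in general form. The function name is hypothetical, and any scores passed to it in an example would be invented for illustration, since the individual subject scores are not reported here.

```python
import math
from statistics import mean, stdev


def paired_t(scores_a, scores_b):
    """Return (t, degrees of freedom) for a related-samples t test."""
    if len(scores_a) != len(scores_b):
        raise ValueError("each subject needs a score in both conditions")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)   # stdev uses n - 1
    return mean(diffs) / standard_error, n - 1
```

With 20 subjects the test has 19 degrees of freedom, and the resulting t value is compared against the critical value for the chosen significance level.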

Although List 1 was always presented first, List 2 second, and List 3
last, the percent correct scores associated with these three lists,
averaged across the three levels of time compression, were
essentially identical (76.2, 74.2, and 76.3 percent correct, respectively).
While  these  lists  were  designed  to  be  equivalent  for  use  in  intelligibility 
testing,  there  was  some  question  as  to  whether  such  equivalency  would  be 
demonstrated  across  various  levels  of  time  compression.   Although  the 
children  in  the  present  study  were  presented  with  only  twenty-five  of  the 
fifty  words  in  each  list,  results  suggest  that  first  half-list  scores  are 
equivalent  across  the  three  levels  of  time  compression  considered. 


[Figure 1. Percent correct scores as a function of percent time compression
for the children in the present study and for the normal children studied by
Beasley et al. The graphic is not recoverable from the source scan.]



As  is  shown  in  Figure  1,  while  performance  of  the  normal  children  studied 
by Beasley et al. decreased in a linear fashion as percent of time compression
was increased, this was not the case for the eight year old children with
auditory perceptual disorders.  Rather, the children in the present study performed
in  a  nearly  identical  fashion  on  both  the  0%  and  30%  time  compression  conditions. 
Thus,  although  the  children  with  auditory  perceptual  problems  did  demonstrate 
consistently  poorer  performance  than  normal  children  at  the  0%  time  compression 
condition  (84.9  and  92.2  percent  correct,  respectively)  and  at  the  60%  time 
compression  condition  (57.3  and  68.6  percent  correct,  respectively),  these 
auditorially  impaired  children  performed  the  same  as  the  normal  children  at 
the  30%  time  compression  condition  (85.4  and  85.2  percent  correct,  respectively). 
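
The comparison in the preceding paragraph reduces to a difference between group means at each level of time compression. The short sketch below recomputes those differences from the means quoted in the text; the variable names are illustrative only.

```python
# Mean percent correct scores quoted in the text, keyed by percent compression.
normal_children = {0: 92.2, 30: 85.2, 60: 68.6}     # Beasley et al.
present_study = {0: 84.9, 30: 85.4, 60: 57.3}       # auditory perceptual disorders

# Positive values mean the normal children scored higher.
gap = {
    pct: round(normal_children[pct] - present_study[pct], 1)
    for pct in normal_children
}

for pct, points in gap.items():
    print(f"{pct:>2}% compression: difference of {points:+.1f} percentage points")
```

The gap is several percentage points at 0% and 60% compression but close to zero at 30%, which is the pattern the text describes.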

Results of previous studies have suggested that 0% time compression
of speech results in optimal performance by children and adults.  In the
present study, however, time compression of speech stimuli by up to 30% still
resulted in optimal processing for the children with auditory perceptual
problems.  These children were able to identify the stimuli
equally well whether the stimuli were presented at an essentially normal rate
(0% time compression) or a rate some 30% faster than normal.

DISCUSSION 

While presentation of the stimuli at 0% and 60% time compression resulted
in less than normal performance for the subjects in the present study, these
children performed as well as normal children when the stimuli were
time compressed by 30%.  It has been suggested that children with auditory
perceptual problems often manifest short-term memory disabilities.  It has also
been hypothesized that the short-term memory system of normal individuals is
characterized by a rapid decay of information from the initial stages of the
system (Aaronson, 1967; Aaronson, Markowitz, and Shapiro, 1971).  Perhaps
children with auditory perceptual problems experience an excessively rapid decay
of information from their short-term memory systems.  In turn, presentation of
stimuli at a moderately increased rate (for example, 30% time compression), while
not excessively distorting the acoustic characteristics of the speech signal, may
result in a neutralizing effect for the rapid decay of the stimuli from the
short-term memory system.  Perhaps, as has also been suggested in several
studies, speech stimuli which have been time compressed at a moderate rate may
be motivating for some individuals.  It has also been suggested that individuals
are capable of processing stimuli at rates much faster than normal speaking
rates, and that when speech is presented at a rate slower than an individual's
optimum processing rate, extraneous stimuli are apt to interfere with the
processing of the primary signal.  This possibility may be particularly
relevant for children with certain types of perceptual problems who
often display characteristic inattentiveness or inability to attend to certain
tasks.

While the results of the present study suggest that time compression of
individual words facilitates the short-term memory function of children with
auditory processing difficulties, the findings are not conclusive.  These
results do, however, indicate that such a hypothesis merits further investigation.
Thirty percent time compression of the speech stimuli employed in the present
study appeared to assist children with auditory perceptual problems to perform
approximately as well as normals.  Perhaps time compression of stimuli of
phrase or sentence length would result in even greater advantages for such
subjects.



REFERENCES

Aaronson, D., Temporal factors in perception and short term memory.
Psychological Bulletin, 67, 130-144 (1967).

Aaronson, D., Markowitz, M., and Shapiro, H., Perception and immediate recall
of normal and "compressed" auditory sequences.  Percept. and Psychophysics,
9, 338-344 (1971).

Beasley, D., Maki, J., and Orchik, D., Children's perception of time-compressed
speech on two measures of speech discrimination.  J. Speech Hearing Dis.,
In Press (1975).

Drumwright, A. F., Denver Articulation Screening Exam, LADOCA Foundation,
Denver, Colorado (1971).

Gesell, A., The First Five Years of Life.  Harper and Rowe (1940).

Kirk, S., McCarthy, J., and Kirk, W., The Illinois Test of Psycholinguistic
Abilities.  Revised edition; Urbana: University of Illinois Press (1968).

Lee, F., Varispeech I: Instructional Manual.  Lexicon, Inc., Waltham,
Massachusetts (1971).

Manning, W., Shaw, C., Maki, J., and Beasley, D., Analysis of half-list scores
on the PB-K as a function of time compression and age.  J. American
Audiology Society, (1975).

Tillman, T., Johnson, R., and Olsen, W., Earphones versus sound field threshold
sound-pressure levels for spondee words.  J. Acoust. Soc. Amer., 39,
125-133 (1966).

Utley, J., What's Its Name, Urbana, Illinois: University of Illinois (1951).




Effects  of  Time  Compression  Upon  Process  Variables  in 
Counseling  Dialog 
by  Schwab,    R. 




EFFECTS   OF  TIME-COMPRESSION 
UPON  PROCESS   VARIABLES   IN  COUNSELING  DIALOGUE 


Reiko  Schwab,  Ed.D. 

Assistant  Professor 

of  Education 

Old  Dominion  University 

Norfolk,  Virginia 


Dr.  Reiko  Schwab  is  an  assistant  professor  of  education 
at  Old  Dominion  University,  Norfolk,  Virginia. 




ABSTRACT 

The study investigated the effects of time-compression
of counseling dialogue upon the evaluation of four therapeutic
process variables (empathy, warmth, genuineness, and depth
of self-exploration), as well as upon the content comprehension
of dialogue.

The experimental tapes consisted of five different
counseling tapes with five levels of compression: 0%, 20%,
30%, 40%, and 50%. A total of 25 counselors and counselor
educators, serving as subjects, listened to a set of five
tapes and evaluated them according to the Truax Tentative
Scale for the Measurement of Accurate Empathy, of Nonpossessive
Warmth, of Genuineness, and of Depth of Self-Exploration.
A content comprehension test prepared for each tape was also
administered to the subjects.

The data, analyzed by a two-way analysis of variance,
indicated a significant effect of compression upon the evaluation
of empathy.  The mean empathy value was particularly
high at 40% compression.  No other process variables were
significantly affected by compression. Comprehension showed
a sharp decline at 50% compression, which was responsible for
a significant F ratio.

The  results  suggest  that  a  moderate  amount  of  compression 
can  be  applied  to  a  counseling  dialogue  without  affecting 
either  the  evaluation  of  the  selected  process  variables  or 
the  content  comprehension. 




Effects  of  Time-Compression 
Upon  Process  Variables  in  Counseling  Dialogue 

The purpose of the study was to determine whether counseling
dialogue was amenable to compression without significantly
affecting either the content comprehension or the evaluation
of variables deemed critical in the therapeutic process:
accurate empathy, warmth, genuineness, and depth of self-
exploration (Rogers, 1961; Truax & Carkhuff, 1969).  Previous
studies on compressed speech have dealt mainly with the
intelligibility and comprehensibility of compressed recordings
of material read by a trained reader (Fairbanks, et al., 1957;
Foulke, et al., 1962; Friedman, et al., 1967; Goldhaber, 1970;
Miller & Licklider, 1950; Orr & Friedman, 1968; Sticht, 1968,
1970).  The findings have indicated that compression to
approximately 250 to 300 wpm can be tolerated without significant
loss in comprehension.  Little is known, however, about the
effects of compression upon either the comprehension of
dialogue or the assessment of affective process
variables which are often prominent in dialogue.  The present
study is believed to have been the first attempt to investigate
the application of time-compression to dialogue.

Design  and  Procedures 

Subjects 

A total of 25 counselors and counselor educators, 16 males
and 9 females, selected on the basis of their professional
qualifications and experience as counselors and counselor
educators, served as subjects for the study. No subject had
previous experience with compressed speech.

Experimental  Tapes 

The five counseling tapes used in the study were chosen from
among many tapes on the basis of several criteria: male
counselor and male client combinations, to control for variance
due to sex differences; the style and quality of the counseling
interaction, to introduce variation in the quality of the
counselor-client verbal exchange; and the acoustic quality of
the recording and reasonably good articulation on the part of
the counselor and client, to minimize nuisance variable effects.

The five 30-minute tapes were each compressed by 20%, 30%,
40%, and 50% through the use of a speech compressor, the
Whirling Dervish. The approximate speed of each counseling
dialogue was estimated by averaging the numbers of words and
syllables in five randomly selected one-minute periods of the
counseling session. The original word rates of the five tapes
varied from 143 to 175 wpm and their syllable rates ranged
from 175 to 218 spm.
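
The effect of each compression level on speaking rate can be worked
out from the original rates. The sketch below assumes the usual
convention, which the paper does not spell out, that X% compression
removes X% of the playing time, so the rate is divided by
(1 - X/100). The function name is illustrative only.

```python
def compressed_rate(original_wpm, compression_pct):
    """Word rate after removing compression_pct percent of the playing time.

    Assumption: X% compression shortens the recording to (100 - X)% of its
    original duration. This is the common convention, not a definition
    taken from the paper.
    """
    if not 0 <= compression_pct < 100:
        raise ValueError("compression_pct must be in the range [0, 100)")
    return original_wpm * 100.0 / (100.0 - compression_pct)


# Slowest and fastest original tapes reported in the text (143 and 175 wpm).
for pct in (0, 20, 30, 40, 50):
    low = compressed_rate(143, pct)
    high = compressed_rate(175, pct)
    print(f"{pct:>2}% compression: {low:.0f} to {high:.0f} wpm")
```

Under this assumption the nominal range at 50% compression is 286 to
350 wpm. Rates measured on the actual tapes need not match these
figures exactly, since the original rates were themselves estimates
from one-minute samples.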

In  order  to  acquaint  subjects  with  the  phenomenon  of 
compressed  dialogue,  a  sample  tape  was  made  which  presented 
a 3-minute dialogue followed by its compressed versions. The
degree  of  compression  was  not  indicated  in  the  sample  tape. 




Instruments 

Counselor's  empathy,  warmth,  genuineness,  and  client's 
depth  of  self-exploration  were  measured  through  the  use  of 
the  Truax  Tentative  Scale  for  the  Measurement  of  Accurate 
Empathy,  of  Nonpossessive  Warmth,  of  Genuineness,  and  of 
Depth of Self-Exploration (Truax & Carkhuff, 1969).

Content  comprehension  tests  for  the  five  tapes,  each 
consisting  of  30  true-false  questions,  were  constructed  by 
the  investigator  and  pretested  on  five  groups  of  four  to  six 
counselor  trainees  who  listened  to  the  tapes.   It  was  ex- 
pected that  upon  careful  listening  at  the  original  speed, 
subjects would respond to the items with better than 90%
accuracy. 

Procedures 

The  assignments  of  subjects  to  groups  and  groups  to 
treatments  (each  treatment  being  a  set  of  five  tapes)  were 
made  randomly.   Each  set  of  tapes  consisted  of  five  different 
counseling  tapes  with  differing  degrees  of  compression,  0%, 
20%, 30%, 40%, and 50%. The subjects were asked to listen
to  the  sample  tape  once  to  become  familiar  with  the  phenomenon 
of  speech  compression  and  then  to  listen  to  each  one  of  the 
five  experimental  tapes  only  once.   The  degrees  of  compression 
of  the  tapes  were  not  disclosed  to  prevent  possible  biased 
responses.   No  subject  listened  to  the  same  tape  twice,  which 
eliminated  possible  contamination  from  previous  exposure  to 
a  tape.   All  the  subjects,  however,  listened  to  all  the  tapes 




at  some  level  of  compression  and  always  heard  them  in  an 
order  of  increasing  degrees  of  compression. 
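
One way to build tape sets with these properties is a cyclic Latin
square: every set contains each tape once and each compression level
once, and across the five sets every tape is heard at every level.
The paper does not say how its sets were constructed, so the cyclic
scheme, the tape labels A to E, and the function name below are all
assumptions made for illustration.

```python
def build_tape_sets(tapes, levels):
    """Return one tape set per group as a list of (tape, level) pairs.

    Cyclic Latin square (an assumed construction, not the paper's stated
    method): set number `shift` pairs the i-th lowest compression level
    with tape (shift + i) mod n. Pairs are listed in order of increasing
    compression, matching the listening order described in the text.
    """
    n = len(tapes)
    if n != len(levels):
        raise ValueError("need the same number of tapes and levels")
    ordered_levels = sorted(levels)
    return [
        [(tapes[(shift + i) % n], ordered_levels[i]) for i in range(n)]
        for shift in range(n)
    ]


TAPES = ["A", "B", "C", "D", "E"]   # hypothetical labels for the five tapes
LEVELS = [0, 20, 30, 40, 50]        # percent compression

tape_sets = build_tape_sets(TAPES, LEVELS)
for number, tape_set in enumerate(tape_sets, start=1):
    listing = ", ".join(f"{tape}@{level}%" for tape, level in tape_set)
    print(f"Set {number}: {listing}")
```

With 25 subjects, five could be assigned at random to each set, so
that each tape-by-level combination is rated by five subjects.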

After  listening  to  each  experimental  tape,  the  subjects 
evaluated  the  tape  according  to  the  four  Truax  scales  and 
took  the  content  comprehension  test.   Questionnaires  were 
also  administered  to  the  subjects  in  order  to  obtain  their 
opinions  as  to  the  desirability  of  the  speeded  presentation 
and  the  acoustic  qualities  of  the  compressed  tapes. 

For  the  purpose  of  establishing  a  baseline  for  each  tape 
and  providing  a  measure  of  validation  for  the  subjects' 
evaluation  of  the  tapes,  three  judges  selected  for  the  study 
also  listened  to  the  five  counseling  tapes  at  the  original 
noncompressed  speed  and  evaluated  them  according  to  the  four 
Truax  scales.   The  qualifications  for  the  judges  were  defined 
in terms of their professional training at the doctoral level,
experience  and  active  engagement  in  counselor  education  or 
the  practice  of  counseling. 

On  the  basis  of  the  judges'  evaluation,  the  tapes  were 
rank-ordered  and  the  rank-order  was  later  correlated  with  the 
rank-order  derived  from  the  subjects'  evaluation  of  tapes. 

Analysis  of  Data 

The  data  were  analyzed  by  a  two-way  analysis  of  variance. 
Five  levels  of  compression  and  five  different  counseling  tapes 
were  independent  variables.   The  dependent  variables  were 
scores  obtained  through  the  four  Truax  scales  and  the  results 
of  a  comprehension  test  of  the  verbal  content  of  each  counsel- 
ing dialogue. 



Results 

Means  and  standard  deviations  of  values  obtained  for 
each  process  variable  and  of  content  comprehension  scores  at 
differing  levels  of  compression  along  with  those  for  five 
different counseling tapes are shown in Tables 1 and 2.


A  central  question  in  the  study  was  whether  time-com- 
pression of  counseling  dialogue  would  significantly  affect 
the  evaluation  of  accurate  empathy,  warmth,  genuineness,  and 
depth  of  self-exploration.  The  results  indicated  a  significant 
effect of compression, at the .05 level, upon the subjects'
evaluation of accurate empathy, but compression did not
significantly affect the subjects' evaluation of warmth,
genuineness,  or  depth  of  self-exploration. 

The main effects of tapes upon the evaluation of the four
process variables were found to be significant at the .001
level. In other words, the five counseling tapes were
significantly  different  in  the  values  of  the  four  process 
variables  tested. 

Compression  was  found  to  affect  the  comprehension  of  the 
counseling dialogue at the .001 level of significance. The
main  effect  of  tapes  upon  content  comprehension,  however, 
was  not  significant,  which  means  that  the  five  tapes  were  not 
significantly  different  in  their  comprehensibility. 




[Table 1. Means and standard deviations of the process variable values and content comprehension scores. The table body is not legible in the source.]


[Table 2. Means and standard deviations of the process variable values and content comprehension scores. The table body is not legible in the source.]


No  significant  tape-by-compression  interaction  effects 
were  found  to  operate  upon  any  one  of  the  four  process 
variables  or  upon  comprehension. 

Discussion 

Of  the  four  process  variables  tested,  accurate  empathy 
was  the  only  variable  that  was  found  to  be  significantly 
affected  by  compression.   The  mean  empathy  value  was  con- 
spicuously higher  at  40%  compression.   Why  empathy  was  the 
only  variable  affected  by  compression  was  not  answered  in 
this  study. 

It appears unlikely that the outcomes were due to random
ratings by the subjects at increased levels of compression.
Subjects had considerable training and experience in counseling
and counselor education. Moreover, the correlation between
the rank-ordering of the tapes based on the subjects' evaluation
and the rank-order derived from the three judges' evaluation
indicated a nonsignificant but substantial agreement
between the judges and the subjects in their evaluation of
the experimental tapes. The Spearman rank-order correlation
coefficient was .70. Nor was it likely that the outcomes were
due to subjects' idiosyncratic tendencies to rate tapes high
or low or to personal bias factors in assessment. By virtue of
the experimental design, such response biases were evenly
distributed between levels of compression and between tapes.
Assignments of subjects to the groups and the groups to
treatments were made randomly. While all the subjects listened
to all five tapes at some level of compression, no subject
listened to the same tape twice. Furthermore, the highly
significant F ratio obtained for the effects of the five tapes
upon process variables provided evidence that the subjects
did indeed differentiate the quality of the variables across
the five different tapes, despite the increasing rate of
presentation.
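
A coefficient of .70 can fail to reach significance because only five
tapes were ranked. The sketch below illustrates this with hypothetical
rankings chosen so that the coefficient comes out at .70; the paper
does not report the actual ranks. It computes Spearman's coefficient
from the rank differences and an exact one-tailed probability by
enumerating all 120 possible orderings of five items.

```python
from itertools import permutations
from math import factorial


def sum_squared_rank_diff(ranks_a, ranks_b):
    """Sum of squared differences between two rankings (no ties)."""
    return sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))


def spearman_rho(ranks_a, ranks_b):
    """Spearman rank-order coefficient: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(ranks_a)
    d2 = sum_squared_rank_diff(ranks_a, ranks_b)
    return 1 - 6 * d2 / (n * (n * n - 1))


def exact_one_tailed_p(ranks_a, ranks_b):
    """Share of all orderings that agree at least as well as the observed pair."""
    n = len(ranks_a)
    observed = sum_squared_rank_diff(ranks_a, ranks_b)
    base = tuple(range(1, n + 1))
    hits = sum(
        1 for perm in permutations(base)
        if sum_squared_rank_diff(base, perm) <= observed
    )
    return hits / factorial(n)


# Hypothetical rankings of the five tapes, for illustration only.
judges = [1, 2, 3, 4, 5]
subjects = [2, 3, 1, 4, 5]

print(round(spearman_rho(judges, subjects), 2))
print(round(exact_one_tailed_p(judges, subjects), 3))
```

With five items, 14 of the 120 possible orderings agree with a given
ranking at least this well, so the one-tailed probability is about
.12, above the .05 criterion.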

A consistent pattern was observed among the mean values of
the process variables obtained at the five compression levels.
They were always found ordered in the same sequence, although
the differences between means were, for the most part, small.
The mean values were highest at 40% compression, followed by
20%, then 0%, and the lowest was at 30% compression. Mean
values at 50% compression did not fit into this general trend.
Why the process variables for the four compression levels were
always ordered in the same way could not be answered by this
study.

Most striking is the fact that on all four process
variables the highest mean values obtained were invariably
at the 40% level of compression. Is it possible that process
variables such as empathy, warmth, genuineness, and
self-exploration are better assessed at moderately compressed
levels, with listeners attending to the counseling dialogue
ever so intently and carefully? Alternatively, it may be that
faster moving dialogue tends to make the listener less critical
and thereby a higher rater of the process variables than when
he listens to relatively slow dialogue at the original speed.
Since the values of process variables, as they are defined
in the Truax scales, were primarily based on verbal
interactions between the counselor and the client, the dialogue
had to be understood for the subjects to evaluate it. At
the 40% level of compression, comprehension was fairly good,
with 12 subjects obtaining comprehension scores of 80% or
better. It might be that the speed of presentation of
counseling dialogue at 40% compression was slow enough to
leave the verbal content comprehensible for the listeners
but not so slow as to allow the listeners to become overly
critical.

Comprehension declined gradually as the rate of presentation
increased and showed a sharp decline at 50% compression.
The Scheffé method applied to post-hoc pair-wise
comparisons revealed that the decline at 50% was responsible
for the significant F ratio. Two subjects at 40% compression
and three subjects at 50% compression obtained a comprehension
score below 15, a chance score on a 30-item test. However,
even at 50% compression, five subjects showed better
than 80% comprehension and seven subjects scored 70 to 79%.
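
The chance score of 15 follows from guessing on 30 true-false items.
As a rough illustration, the sketch below treats a pure guesser's
score as binomial with probability .5 per item (an idealization; the
function name is hypothetical) and asks how likely such a guesser
would be to reach 80%, that is, 24 or more items correct.

```python
from math import comb


def chance_probability_at_least(score, n_items=30, p_correct=0.5):
    """Probability that pure guessing yields `score` or more correct items."""
    return sum(
        comb(n_items, k) * p_correct ** k * (1 - p_correct) ** (n_items - k)
        for k in range(score, n_items + 1)
    )


# Expected score for a guesser on 30 true-false items.
expected_chance_score = 30 * 0.5
print(expected_chance_score)

# Probability of reaching the 80% mark (24 of 30) by guessing alone.
print(f"{chance_probability_at_least(24):.6f}")
```

Under this idealization a score of 80% or better is very unlikely to
arise from guessing, so the five subjects who reached it at 50%
compression were almost certainly understanding the dialogue.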

With  respect  to  comprehension,  the  results  of  the  present 
study  cannot  be  readily  compared  with  previous  findings  on 
the  effects  of  compression  upon  comprehension.   Factors 
foreign  to  oral  reading,  with  which  previous  studies  were 
concerned,  such  as  the  overlapping  of  conversations,  speech 




accompanied  by  nervous  laughter,  and  careless  enunciation, 
appeared  to  interfere  with  listening  more  at  compressed  levels 
than  at  the  original  speed.   On  the  other  hand,  the  words  and 
sentence  structure  in  counseling  dialogue  were  rather  simple 
as  compared  to  those  of  reading  materials  used  in  other 
studies,  which  in  all  likelihood  aided  comprehension.   Never- 
theless, the  results  of  this  study  were  similar  to  previous 
findings  with  oral  reading  materials,  which  indicated  that 
comprehension was not significantly affected up to 250-300 wpm.
In this study, at 50% compression, at which a significant
decline  in  comprehension  occurred,  the  word  rate  ranged  from 
2?8wpm  to  3^0wpm. 

No  effects  of  interaction  between  tapes  and  compression 
levels  upon  four  process  variables  were  found  to  be  statisti- 
cally significant,  nor  was  the  interaction  effect  upon  content 
comprehension  significant.   However,  the  data  suggested  that 
tapes  in  which  the  speakers  were  articulate  were  comprehended 
somewhat  better,  which  was  especially  evident  at  higher  levels 
of  compression.   In  fact,  several  subjects  indicated  that  the 
speakers'  voice,  not  the  speed,  caused  difficulty  in  listen- 
ing to  some  tapes.   It  appears  that  the  degree  of  compression 
alone  did  not  account  for  the  decline  in  comprehension. 

The  subjects'  responses  to  questionnaires  indicated 
that  several  subjects  found  noncompressed  tapes  to  be  too 
slow and 20% and 30% compression (20^-241 wpm) to be most
suitable  speeds  of  presentation.   These  subjects  reacted 




favorably  to  the  efficiency  gained  through  the  use  of  com- 
pression and  found  their  heightened  attention  at  the  20%  and 
30%  levels  of  compression  to  be  desirable  for  listening. 
The  same  speeds  were  experienced  by  some  subjects  as  being 
too  fast,  and  the  demand  on  their  attention  was  felt  as  an 
unnecessary  and  unpleasant  psychological  strain.   It  appears 
that  while  some  subjects  found  compressed  counseling  tapes 
intriguing  and  fascinating,  the  newness  of  the  experience 
caused  a  psychological  stress  and  resistance  among  other 
subjects. 

Conclusions 

The  results  of  the  study  suggest  that  a  moderate  degree 
of  compression  does  not  distort  or  diminish  affective  qualities 
expressed  in  a  dialogue,  nor  does  it  affect  the  comprehension 
of  dialogue.   For  subjects  in  this  study,  the  maximum  com- 
pression level  appeared  to  be  30%.   Though  individual  dif- 
ferences in  tolerating  speeded  auditory  input  can  be  expected, 
some  training  in  listening  to  compressed  tapes  on  the  part 
of  subjects  might  change  the  apparent  limit  discovered  in  this 
study  to  even  higher  levels  of  compression. 

Applied to counseling research and supervision, time-
compression would allow counselor educators, supervisors,
and students to listen to counseling tapes in less time but
with as much insight as they could have gained through
listening to tapes at original speeds, or possibly with better
content comprehension and greater appreciation of process
variables due to their heightened attention. The elevation of
the mean process variable values at 40% compression which was
observed in this study appears to parallel findings from some
previous studies which indicated that comprehension, recall,
or retention scores decreased as degrees of compression
increased but improved again at higher levels of compression,
probably attributable to better concentration on the part of
listeners (Orr & Friedman, 1965; Orr, Friedman, & Williams,
1965).

Time  saved  through  the  use  of  compressed  tapes  could  be 
more  profitably  devoted  by  counselors  and  counselor  educators 
to  other  professional  activities  which  demand  their  attention. 
Supervision  through  audio  tapes  could  also  be  improved  as  a 
result of supervisors' hearing the entire counseling tape at
a  compressed  level  rather  than  attending  to  only  portions  of 
the  noncompressed  tapes,  thereby  gaining  a  clearer  idea  of  the 
development  of  the  counselor-client  interaction.   With  the 
current  introduction  of  tape  recorders  equipped  with  contin- 
uous rate  control,  it  would  be  possible  for  an  individual  to 
listen  to  the  beginning  of  a  tape  at  the  original  speed  to 
gain  a  flavor  of  the  natural  flow  of  counseling  dialogue  and 
to  speed  up  or  slow  down  thereafter  to  suit  his  interests 
and  listening  ability. 

The  results  obtained  from  the  study  need  to  be  further 
substantiated  through  replications  of  the  study.  In  future 
studies  consideration  may  be  given  to  the  adoption  of  different 




devices  to  evaluate  counseling  tapes,  a  wider  selection  of 
experimental  tapes,  possibly  including  both  male  and  female 
counselors  and  clients,  and  the  introduction  of  other  in- 
dependent variables  such  as  subjects'  listening  ability  and 
subjects'  experience  in  listening  to  compressed  dialogue. 




References 


Fairbanks, G., Guttman, N., & Miron, M. S. Effects of time
compression upon the comprehension of connected speech.
Journal of Speech & Hearing Disorders, 1957, 22, 10-19.

Foulke, E., Amster, C. H., Nolan, C. Y., & Bixler, R. H.
The comprehension of rapid speech by the blind.
Exceptional Children, 1962, 29, 134-141.

Friedman, H. L., Orr, D. B., & Graae, C. N. Further research
on speeded speech as an educational medium: materials
comparison experimentation. Final report. Washington,
D. C.: American Institutes for Research, 1967.

Goldhaber, G. M. Listener comprehension of compressed speech
as a function of the academic grade level of the subjects.
Journal of Communication, 1970, 20, 167-173.

Miller, G. A., & Licklider, J. C. R. The intelligibility of
interrupted speech. Journal of the Acoustical Society
of America, 1950, 22, 167-173.

Orr, D. B., & Friedman, H. L. Research on speeded speech as
an educational medium. Project report, BR-5-0801.
Washington, D. C.: American Institutes for Research,
1965. ED 014 496

Orr, D. B., Friedman, H. L., & Williams, J. C. C.
Trainability of listening comprehension of speeded discourse.
Journal of Educational Psychology, 1965, 56, 148-156.

Rogers, C. R. On becoming a person. Boston: Houghton Mifflin,
1961.

Sticht, T. G. Some relationships of mental aptitude, reading
ability, and listening ability using normal and
time-compressed speech. Journal of Communication,
1968, 18, 243-258.

Sticht, T. G. Studies on the efficiency of learning by
listening to time-compressed speech. Alexandria, Va.:
The George Washington University, Human Resources
Research Organization, 1970.

Truax, C. B., & Carkhuff, R. R. Toward effective counseling
and psychotherapy. Chicago: Aldine, 1969.




Temporal Redundancy and the Esthetics of Time-Compressed
Speech:    A  Conceptualization 
by  Glasser,    T.  L. 


Theodore L. Glasser is an instructor and doctoral candidate in mass communication
at the University of Iowa.




ABSTRACT 

The relationship between temporal redundancy and the perception of
time-compressed speech is an important one. There exists, in theory, an inverse
relationship between temporal redundancy and time-compressed speech; that is, as
speech is compressed in time there is a proportional decrease in temporal
redundancy. Thus, we need only compress speech to what might be called an "optimum"
level of temporal redundancy, a level at which the only remaining signal
duration is that which is, presumably, essential for comprehension.

It follows, then, that the optimum level of signal duration is more or less
an index of the most efficient rate of speech for purposes of comprehension. And
yet, as Moles (1966) argues, what we glean from human speech often transcends the
normative value of comprehension. What may be the most efficient rate of speech
for purposes of comprehension may be a far less efficient rate for what Moles
calls "esthetic" information. It would be folly, therefore, to believe that
the only measure of "efficiency" is the normative value of comprehension.

This paper explores further the distinction Moles makes between semantic and
esthetic information. Ultimately, the objective of this paper is to demonstrate
the need to view temporal redundancy in a somewhat broader context, thus allowing
for some consideration of the esthetic value of time-compressed speech.




TEMPORAL  REDUNDANCY  AND  THE  ESTHETICS 
OF  TIME-COMPRESSED  SPEECH 

Why speech remains comprehensible even when portions of phonemes have been
discarded is due largely to the existence of temporal redundancy, a concept
referred to modestly by Fairbanks, Everitt and Jaeger (1954) as a "useful
specification." Conceptually, temporal redundancy may be defined as the duration of
phonemes in connected discourse beyond the minimum duration necessary for
auditory perception. Simply put, the amount of time afforded most phonemes by the
speaker far exceeds the amount of time needed by the listener for the perception
of phonemes; hence, phonemes are more or less temporally redundant.

Whenever a speech signal is reproduced in less time than the time required
for its original production (i.e., compressed in time), the duration of phonemes,
and thus temporal redundancy, is invariably reduced. Indeed, virtually
any attempt to compress speech in time can be characterized as an attempt to
reduce the duration of phonemes to some "optimum" level, a level at which,
presumably, the remaining duration is that which is essential for comprehension or,
perhaps, intelligibility. This optimum level is, however, somewhat dependent on,
among other factors, (i) how speech is compressed in time, and (ii) the task
required of the listener. Certain methods of compression, most notably the
"increasing the rate of playback" method, introduce additional "noise" (e.g., pitch
distortion) and thus require additional temporal redundancy. As for the
listener's task, the findings of Foulke and Sticht (1967), among others, support the
hypothesis that phonemes are more temporally redundant when intelligibility is
the dependent variable, and considerably less redundant when comprehension is at
issue. By way of explanation, Overmann (1971: 106-107) reasons that the processing
time required for comprehension is greater than the time needed for
intelligibility. "The listener's task, when he is required to demonstrate comprehension,
is more difficult than the task required when intelligibility is measured."
Thus, the mere recognition of phonemes in, say, a list of words requires less
signal duration than the comprehension of those same words in the context of
meaningful discourse. (In the case of comprehension, of course, phonemes
themselves are not literally "understood", but are rather mediated through the
totality of larger perceptual units.)

In short, the relationship between temporal redundancy and the perception of
time-compressed speech is an important one, whether perception be operationally
defined as comprehension or intelligibility. What is unfortunate, however, is
that comprehension, and to a lesser extent intelligibility, have come to be
regarded as the sole criteria for the efficacy of time-compressed speech. For
when comprehension and intelligibility become the predominant dependent variables
(as they have), the corresponding supposition is that the durations of phonemes
are, at least in part, necessarily and always excessive or superfluous; inevitably,
temporal redundancy will be considered almost exclusively in its pejorative sense.
That the "information" we glean from human speech often transcends the largely
normative value of comprehension is, therefore, a consideration too frequently
ignored. That temporal redundancy has any value whatsoever is a notion seldom
afforded conscious scrutiny.

The intent here, simply, is to explicate a rationale for viewing temporal
redundancy in a somewhat broader context. To do so, we intend to depict human
speech in two dimensions, a conceptualization which draws heavily from, and at
times closely parallels, the work of Abraham Moles (1966), although unlike Moles
we place less emphasis on the theorems of information theory. Ultimately, the
objective of this paper is to demonstrate that any evaluation of the "efficiency"
of time-compressed speech must go beyond such normative considerations as
comprehension/intelligibility.

The  Semantic  and  Esthetic  Dimensions  of  Human  Speech 

Excluding only theoretical extremes, all human speech can be said to have
both  a  semantic  and  esthetic  dimension.  Often  the  two  dimensions  are  readily 
distinguishable;  at  times,  however,  the  distinction  is  a  frustratingly  fine  one. 
Although  the  semantic  and  esthetic  dimensions  are  not  to  be  viewed  as  mutually 
exclusive  or  dichotomous,  they  will  be  portrayed  here  as  two  distinct  entities. 

Semantic  Information 

Semantic information, as conceived by Moles, is logical, extensional, and
largely utilitarian; its intent is to prepare action. An order, instructions, a
lecture all convey essentially semantic information, although such messages are
by no means without an esthetic component. Moles (1966: 132) uses radio news
to illustrate:

Part  of  the  news  on  radio  is  obviously  semantic  information;  it 
determines  the  reactions  of  most  of  the  public,  or  at  least  a 
logically  definable  fraction  of  the  auditors.  This  part  is,  for 
example,  the  weather  forecast,  if  we  want  to  go  out  the  next  day; 
stock  market  reports,  if  we  are  shareholders;  administrative  in- 
formation, if  we  are  under  an  administration;  laws  and  decrees, 
if  we  are  under  a  government. 

Semantic information, then, would appear to be far more pragmatic than
appreciative. Generally, those messages which have a dominant semantic
dimension are those which have, at least potentially, some practical utility. Above
all else, though, the semantic dimension is content-bound, and can thus be viewed
in terms of its symbolic import. Appropriately, semantic information can be
regarded as normative, and is therefore relatively easy to objectify since it is
proportional to comprehension, a variable which has long lent itself to
quantification (see Moles, 1966: lli3). Moreover, the semantic dimension is not
channel-specific; a given message would, in all likelihood, retain equivalent
semantic information whether it be made available through, say, an audio or visual
channel.

To  judge  the  efficiency  of  speech  for  conveying  semantic  information,  we  need 
only  posit,  as  have  Woodcock  and  Clark  (1968),  that  efficiency  equals  the  treat- 
ment mean  minus  the  "test-only"  mean  over  the  listening  time  in  minutes.  Diagram- 
matically,  a  formula  for  measuring  listening  efficiency  might  be: 


Listening Efficiency = (Treatment Mean - "Test-Only" Mean) / Listening Time in Minutes
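
The efficiency measure above can be expressed as a short function.
The sketch below is an illustration only: the function name is
hypothetical and the scores and listening times are invented, not
taken from Woodcock and Clark (1968) or from any study cited here.

```python
def listening_efficiency(treatment_mean, test_only_mean, listening_minutes):
    """Comprehension gained per minute of listening.

    treatment_mean: mean test score of listeners who heard the material.
    test_only_mean: mean score of a group that took the test without listening.
    listening_minutes: playing time of the (possibly compressed) recording.
    """
    if listening_minutes <= 0:
        raise ValueError("listening_minutes must be positive")
    return (treatment_mean - test_only_mean) / listening_minutes


# Invented figures: a 30-minute recording at normal speed, and the same
# recording shortened to 21 minutes with a slightly lower mean score.
normal = listening_efficiency(24.0, 15.0, 30.0)
compressed = listening_efficiency(23.0, 15.0, 21.0)
print(f"normal: {normal:.3f}  compressed: {compressed:.3f}")
```

In this invented case the compressed recording yields a lower mean
score but a higher efficiency, because the small loss in comprehension
is outweighed by the time saved.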


In  the  context  of  the  semantic  dimension,  temporal  redundancy  is  indeed 
"wastage";  the  duration  of  phonemes  beyond  the  minimum  duration  necessary  for 
comprehension  is,  by  definition,  redundant.  Thus,  to  improve  the  efficiency  of 
speech  —  for  purposes  of  exploiting  only  the  semantic  dimension  —  the  most 
appealing strategies would be those which were calculated to reduce temporal
redundancy  to  an  optimum  level. 

Esthetic  Information 

The esthetic dimension, unlike the semantic, is largely intensional; it
does not prepare for action, nor can it be said that esthetic information has any
"goals" in terms of facilitating social activity. Accordingly, Moles (1966: 132-
133) characterizes the esthetic dimension as principally gratuitous. "It
'informs' in the common sense of the word. It communicates anger or euphoria
without determining any present or future reactions...."

...in  a  speech,  an  orator  tries  to  convince  as  much  by  the 
warmth,  attractiveness,  and  persuasiveness  of  his  voice  as  by 
the  purely  logical  implications  of  what  he  states. 

In a theatre play, the argument, the action, the story
told, as well as the logical implications, are semantic information.
The movements of the actors, the warmth of their voices,
their expressions, the richness of the scenery are chiefly esthetic
information.

Although  both  semantic  and  esthetic  information  are  contingent  upon  what  an 
individual  can  assimilate  from  a  given  message,  only  the  former  can  be  thought  of 
as  normative.  The  esthetic  dimension  transcends  the  logical  structure  and  order- 
liness of  the  semantic  dimension;  esthetic  information  is  ipsative,  a  kind  of 
"personal"  information.  There  are,  therefore,  few  a  priori  assumptions  to  be 
made about appropriate or inappropriate (i.e., right or wrong) "understandings" of
esthetic information. And in contrast to the semantic dimension, the esthetic
dimension is very much channel-specific; that is, unlike semantic information,
esthetic  information  is  not  "translatable"  from  one  channel  to  another. 

Since  the  esthetic  dimension  transcends  the  content  of  a  message,  it  would  be 
somewhat  of  a  non  sequitur  to  talk  about  what  is  and  what  is  not  redundant  —  es- 
pecially in  the  context  of  comprehension.   Indeed,  esthetic  information  is  rarely 
"banal"  in  the  sense  of  truly  redundant  semantic  information;  esthetically,  there 
is  always  some  "residual"  information  to  be  gleaned.  This  partly  explains  how  we 
can watch a movie over and over again and, although we may have a similar
understanding (normatively) each time, have a very different "appreciation" with each
subsequent viewing. The notion of temporal redundancy too, therefore, is not
especially relevant to the esthetic dimension. The durations of phonemes function
in very different ways as we move from the semantic to the esthetic. To reduce
signal duration (ostensibly in an attempt to improve the efficiency of speech)
may radically change the esthetic dimension. Thus, it would be folly to posit
that, in the context of esthetic information, the durations of phonemes are
necessarily excessive.

"Communication-Pleasure"  and  the 
Euphony  of  Human  Speech 

Anthropologists  have  long  noted  the  relative  insignificance  of  content  in  a 
variety of speech transactions. More than half a century ago, Malinowski (1923:
309-316) studied a type of communicative behavior which was, as he described it,
a "communion of words", the establishing and maintaining of communication for the
sake of fellowship and little else. What Malinowski sought to understand were
those  truly  uninspired  verbal  amenities  which  serve  only  to  create  an  "atmos- 
phere of  sociability." 

As John Dewey reminds us, "Communication is consummatory as well as
instrumental."

In many ways the esthetic dimension, as conceived by Moles, may be likened
to Stephenson's (1967) description of communication-pleasure, a conception which
calls for a largely cultural account of "self-enhancement." For Stephenson,
communication-pleasure centers on a highly subjective (i.e., personal) experience
without expectancy of change or action. For example:

When two people meet and converse, they may say afterwards how
much they enjoyed it. They have been talking in a complex way,
now serious, now in fun, now at cross-purposes, now with gusto,
in intricate interaction. The talk serves no apparent purpose
as far as one can see; one person is not necessarily trying to
convince the other, to subdue the other, to get anything out of
the other. They are not trying to please one another -- nor is
the one in some remote degree having fantasy about the other,
or seeking to seduce, influence, or in any way become involved
in the other's purposes. Afterwards, they both say how pleasant
it was. This is communication-pleasure: its characteristic is
that the two so talking are not expecting anything. (Stephenson,
1967: 57)

It is, simply, the "passing of time" which is, in many instances, the primary
function of speech. Thus, to drastically reduce the amount of time consumed may
well be, at times, dysfunctional.

The Euphony of Human Speech

One  could  hardly  argue  that  communication-pleasure  is  derived  solely  from  a 
pleasant  or  pleasing  voice.  Still,  it  is  true  that  communication  is  often  enjoyed 
for  its  own  sake,  and  that  there  is  an  "intrinsic  delight"  in  sensory  perception. 
The  mere  sound  of  human  speech,  it  follows,  is  a  value  worthy  of  some  considera- 
tion. 

When speech is compressed in time (that is, when the durations of phonemes are
reduced), the "naturalness" of human speech is necessarily altered. As the "growth"
and "decay" of vowels and other continuants are reduced, plosives tend to become
more perceptually prevalent. Although this may not adversely affect comprehension,
what impact will it have on those who have, for one reason or another, a
perceptual disposition to hear natural or "uncompressed" speech? To what extent, we
must ask, is time-compressed speech "de-euphonized" speech?

There  is,  further,  some  evidence  to  suggest  that  increased  rates  of  speech 
may result in perceived anxiety. And there is reason to believe that this
perceived anxiety may be "infectious," thus producing anxiety in the listener (see
Mahl and Schulze, 1964: 84). Again, although such anxiety may not detract from
the listener's ability to comprehend, to what extent is an "anxious listener" less
apt to be "self-enhanced"? What is at issue here is the dynamics of human speech.
There are, clearly, a host of extralinguistic phenomena to which we must devote
considerable attention before we can say, with any degree of certainty, that
time-compressed speech is applicable in one context and not in another.

On  the  "Efficiency"   of  Time-Compressed  Speech 

Much  of  the  research  on  time-compressed  speech  seems  to  imply  that  efficiency 
is  simply  a  matter  of  how  fast  auditory  data  can  be  "successfully"   transmitted. 
With  few  exceptions,   "successful  transmission"  means  the  data  were  either  intel- 
ligible or  comprehensible.     But  as  we  have  tried  to  demonstrate,   this    is  an  un- 
necessarily limited  view  of  the  value  and  functions  of  human  speech.     More  ap- 
propriately, speech  can  be  said  to  be  "efficient"  only  relative  to  the  context  in 
which  it  is  used  and  the  uses   to  which   it  is  put. 

From Administrative to Critical Research
A  Methodological  Direction 

"It  is  very  poorly  known  and  difficult  to  measure,"   says  Moles    (1966:  132) 
about  esthetic   information.     Measures   of  comprehension  and   intelligibility,  no 
matter  how  inclusive  the  dependent  variables  may  be,  will  tell  us   little  about 
the esthetic dimension. Communication-pleasure is not, in other words,
discoverable by means of normative measures. To identify the esthetic dimension, and to
determine the scope and relative importance of communication-pleasure,
Stephenson (1953: 68) proposes his Q-Methodology, which "...provides a systematic way
to handle a person's retrospections, his reflections about himself and others,
his introjections and projections, and much else of an apparent subjective
nature." Essentially, Q-Methodology involves what is called "Q-Sort Technique":
an individual is asked to "sort" (discriminate among) a "universe" of
self-referential (and often self-generated) statements (or other stimuli) along some
continuum. But before Stephenson's methodology becomes a useful tool, there will
be a need to recognize that human speech is not exclusively instructive in
purpose; or, if such is the purpose, it need not necessarily be the function as well.
To date, researchers have denied, albeit implicitly, the existence of anything
but a semantic dimension. At best, when an esthetic dimension is acknowledged,
it is invariably relegated to the "beyond our purview" category.

The prevailing question in most of the research on time-compressed speech
has been of the "What effect does....?" variety. The question epitomizes what
Lazarsfeld (1941) calls "administrative" research, a type of research
characteristically narrow in scope and well-defined in purpose. Now, administrative
research is not to be slighted, for such studies contribute greatly to an
understanding of the phenomenon in question. But as the technology for time-compressed
speech moves away from the laboratory and into the home, the administrative
approach becomes less and less desirable. As time-compressed speech becomes
widely diffused, the research question must shift from "What Does Time-Compressed
Speech Do To People?" to "What Do People Do With Time-Compressed Speech?" And
eventually, even the latter will have to be modified to ask: "What Should People Do
With Time-Compressed Speech?" The type of research to which we now allude is
critical research; in Lazarsfeld's words (1941: 9), it is the type of research which
"...seems to imply ideas of basic human values according to which all actual or
desired effects should be appraised." To discover the esthetic dimension requires
critical research, for communication-pleasure is a matter of human values.

The  important  point   is,  simply,   that  time-compressed  speech  —  not  unlike 
most  any  technology  —  is    itself  valueless.     It   is   only  the  use  we  make  of  it 
that  can  be   said  to  have  any  value. 




References 

Emerson Foulke and Thomas G. Sticht (1967) "The Intelligibility and Comprehension
of Time-Compressed Speech," pp. 21-28 in Emerson Foulke (ed.) Proceedings of
the Louisville Conference on Time-Compressed Speech. Louisville: University
of Louisville.

Grant Fairbanks, W.L. Everitt, and R.P. Jaeger (1954) "Method for Time or
Frequency Compression-Expansion of Speech." Transactions of the Institute of
Radio Engineers, Professional Group on Audio, AU-2 (1): 7-12. Also in Sam
Duker (ed.) Time-Compressed Speech: An Anthology and Bibliography in Three
Volumes, pp. 172-180. New Jersey: The Scarecrow Press, 1974.

Paul Felix Lazarsfeld (1941) "Remarks on Administrative and Critical Communications
Research." Studies in Philosophy and Social Science. IX: 2-16.

George F. Mahl and Gene Schulze (1964) "Psychological Research in the Extralinguistic
Area," pp. 51-124 in T.A. Sebeok, et al. (eds.) Approaches to Semiotics.
The Hague: Mouton.

Bronislaw Malinowski (1923) "The Problem of Meaning in Primitive Languages,"
pp. 296-336 in C.K. Ogden and I.A. Richards The Meaning of Meaning. New
York: Harcourt, Brace and World.

Abraham Moles (1966) Information Theory and Esthetic Perception (J.E. Cohen, trans.).
Urbana: University of Illinois Press.

Ruth Ann Overmann (1971) "Processing Time As A Variable In The Comprehension of
Time-Compressed Speech," pp. 103-118 in Emerson Foulke (ed.) Proceedings of
the Second Louisville Conference on Rate and/or Frequency-Controlled Speech.
Louisville: University of Louisville.

William Stephenson (1967) The Play Theory of Mass Communication. Chicago: The
University of Chicago Press.

William Stephenson (1953) The Study of Behavior: Q-Technique and Its Methodology.
Chicago: The University of Chicago Press.

Richard W. Woodcock and Charlotte R. Clark (1968) "Comprehension of a Narrative
Passage by Elementary School Children as a Function of Listening Rate,
Retention Period, and IQ." Journal of Communication. 18: 259-271.




Toward a Theory of Rate-Controlled Speech
by  Meadows,    C.  L. 


Charles  L.    Meadows 

Director,    Foreign  Language  &  Special  Learning  Laboratories 

Morehouse  College 

Atlanta,    Georgia    30314 




Abstract 

TOWARD A THEORY OF RATE-CONTROLLED SPEECH

Charles  L.    Meadows 

Rate-controlled speech has attracted the interest of persons from
many and widely varied sectors of the academic and professional world.
Although their general interest has been the same--rate-controlled speech--
their backgrounds, motives, and particular interests have varied greatly.

Though beneficial in many ways, this fact has also been a significant
deterrent to careful and systematic research in this area. Consequently,
only limited progress has been made toward the development of a theory of
rate-controlled speech.

If the full potential of rate-controlled speech is to be realized,
possible practitioners must be able to call upon some body of systematic data
or theory.

Even a cursory examination of reported studies reveals differences
in types of equipment used, kinds of tests employed, operational procedures
followed, reporting methods utilized, etc.

These different techniques, equipment types, etc., are useful, and
indeed necessary if we are to fully explore and develop the possibilities of
this new facility. However, certain guidelines must be developed and
observed for conducting and reporting such studies. This in turn should
facilitate more meaningful examination and interpretation of data by readers
of these reports.

When and only when these steps are combined in a systematic manner
will a viable theory of rate-controlled speech be able to emerge.


"Toward a Theory of Rate-Controlled Speech"

If the full potential of Rate-Controlled Speech is to be realized, prospective
researchers and practitioners, as well as consumers, must be able to call upon some body
of reliable and systematic data or theory for use in making necessary decisions or
in drawing necessary conclusions.

Such a body of systematically developed and/or easily accessible data does not
presently exist.

It is the purpose of this paper, therefore, to do at least two things:

1. To list a few reasons why present practices have not and cannot produce
   such a body of information.

2. To outline several steps which could lead to the correction of this
   unfortunate situation.

Rate-Controlled Speech has attracted the interests of persons from many and
widely varied sectors of the academic and professional world. This group includes
those working with the blind, linguists, speech & communication theorists,
psychologists, and audiologists, to name only a few. All of these user-investigators have
been interested in some form of Rate-Controlled Speech. However, their training,
backgrounds, motives, and particular interests have varied greatly.

Communications  technicians  are  generally  highly  trained  in  telecommunications 
technology.  They  may,  however,  have  very  little  if  any  training  in  message  design 
or the proper construction of listening tests.

A worker with the blind may be expertly trained in, and sensitive to, special
problems related to the blind learner. He may at the same time, however, know
little about advanced tricks of the trade in psychometrics. On the other hand,
a learning specialist with excellent training in psychometry may have very little
knowledge of, and even less access to, appropriate sound equipment, language-laboratory-type
booths, or other materials and facilities necessary for sophisticated studies in
Rate-Controlled Speech.




A second factor which has, for rather logical reasons, caused serious problems in
analyzing and comparing research results is the use of different equipment types. In recent
years, several different manufacturers have developed their own speech-rate changing
machines. In a relatively new field, this is to be expected and even desired.
These machines may differ, however, in anything from the smallest and most insignificant
detail to rate-changing methods based on entirely different principles. Is it
reasonable to assume, therefore, that certain materials compressed to 325 wpm on
the rate changer produced by Company A will be equally intelligible when compressed
to that same rate on the machine produced by Company B? Should one assume that
voice pitch will have the same effect on materials processed by different types of
machines? Might one not speculate that machine A may produce better results within
the range of 200-295 words-per-minute while machine B may yield better results within
the 296-380 wpm range?

Finally, a third, but significantly debilitating, factor has been caused by two
conditions:

1. Equipment for electronically or mechanically varying the rate of recorded
   speech is presently both expensive and relatively inaccessible to the
   average potential user. This fact is particularly insidious in its effect on
   basic researchers such as writers of dissertations, etc. Because of the
   considerable expense usually associated with this type of research, one must
   in effect convince potential sponsors of the imminent fiscal efficacy of
   research results before permission or approval of such a project will even
   be considered. The writer by no means wishes to imply that the zealous
   researcher would be daunted by such minor obstacles. Neither does he
   suggest that research results might come to be tainted by efforts to solicit
   continued funding. The objective here is merely to point out difficulties
   of financially unestablished investigators in this area.

Up  to  this  point,  the  writer  has  attempted  to  point  out  a  few  factors  which 
have  rather  systematically  prevented  the  development  of  a  systematic  body  of  data 
or  a  theory  of  Rate  Controlled  Speech.  This  list  is  obviously  not  intended  to  be 
exhaustive.  Additionally,  all  of  the  items  listed  here  have  been  cited  by  several 
and  certainly  noticed  by  all  serious  investigators  or  even  reviewers  of  literature 
in  this  area. 

The next section of this paper suggests steps which could alleviate many of
these discrepancies and which could hasten our development of a systematic and
viable theory of Rate-Controlled Speech.

1. Standardize Rate-Controlled Speech Research Reporting Procedures. The
most cursory examination of literature reporting the results of studies conducted
in Rate-Controlled Speech will reveal a very conspicuous lack of uniformity in
reporting procedures. Because researchers in this area have come from such widely
varied backgrounds, it appears that a standardized reporting format would be
desirable. Such a convention would enable us to capitalize on the varied
backgrounds, interests, and techniques of all researchers in the field while at the same
time ensuring that their findings could be totally digestible to other researchers.
Such an instrument might take a form similar to the following:

I.   THE PROBLEM
        Statement of the Problem
        The Major Hypothesis (or Hypotheses)
        Definition of Terms Used
II.  PURPOSE FOR STUDY
III. RESEARCH DESIGN
        Subjects
        Materials
            The Message(s)
            The Test(s)
        Equipment
            Type of Rate Changing Equipment Used
            General Description of Equipment and Facility Lay-Out
IV.  PROCEDURE FOLLOWED
V.   RESULTS
VI.  CONCLUSIONS

APPENDIX A.  All Messages (Full Text)
             All Messages (in Recorded Form as Administered)
APPENDIX B.  All Test Instruments (in Text Form)
             All Test Instruments (in Recorded Form if so Administered)
APPENDIX C.  Raw Data of the Experiment
APPENDIX D.  Technical Description of Equipment and Facility Lay-Out
             with Photographs or Appropriate Graphic Representation.
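As a rough illustration of how such a format could be checked for completeness, the outline above might be held as a simple record. This is only a sketch: the class name, the field names, and the completeness check are hypothetical and are not part of the proposed format itself.

```python
from dataclasses import dataclass, fields


@dataclass
class StudyReport:
    """One field per section of the proposed reporting format (names are illustrative)."""
    problem_statement: str = ""
    hypotheses: str = ""
    definitions: str = ""
    purpose: str = ""
    subjects: str = ""
    messages: str = ""
    tests: str = ""
    rate_changing_equipment: str = ""
    facility_layout: str = ""
    procedure: str = ""
    results: str = ""
    conclusions: str = ""
    appendix_messages: str = ""
    appendix_tests: str = ""
    appendix_raw_data: str = ""
    appendix_technical_description: str = ""


def missing_sections(report):
    """Return the names of sections that were left empty."""
    return [f.name for f in fields(report) if not getattr(report, f.name).strip()]


# A report in which only the problem statement has been written so far.
draft = StudyReport(problem_statement="Effect of listening rate on comprehension")
gaps = missing_sections(draft)
```

A reviewer or editor could use a check of this kind to see at a glance which of the suggested sections a submitted report omits.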

The Problem: It would appear that in most cases, the "Problem" section of
reports has been rather clearly stated.

III. RESEARCH DESIGN

Subjects:  It  is  the  writer's  opinion,  that  in  the  majority  of  cases,  this 
variable  also  has  been  well  described  and  controlled  for. 

In the area of Materials, however, several suggestions would appear in order.
In many cases very little specific information is given relative to the actual type
(meaning kind), text, or format of message materials used. Minimally the following
information should be provided:

a. a description of basic message format

   In many cases only a rather scanty description is given of the message
   format. Some possible variables which are seldom sufficiently reported are:

   (1) sex of person voicing the message(s)
   (2) frequency range of speaker's voice
   (3) accent or regional variation in speaker
   (4) professional speaker vs. non-professional
   (5) original word-per-minute rate

b. indication of kind of information (descriptive, explanatory, etc.)

c. an indication as to relative level of complexity

d. a description of processed format (indicating both degree of compression
   and resultant word-per-minute rate)
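The relationship between the original rate, the degree of compression, and the resultant rate in item d can be sketched as follows. The sketch assumes that "degree of compression" means the fraction of the original playing time that is removed (so that 50 percent compression halves the playing time); a report using a different definition would need a different formula. The function names and sample figures are hypothetical.

```python
def compressed_rate(original_wpm, compression_fraction):
    """Resultant word-per-minute rate after removing a fraction of playing time."""
    if not 0.0 <= compression_fraction < 1.0:
        raise ValueError("compression fraction must be at least 0 and less than 1")
    return original_wpm / (1.0 - compression_fraction)


def compression_needed(original_wpm, target_wpm):
    """Fraction of playing time to remove in order to reach a target rate."""
    if original_wpm <= 0 or target_wpm < original_wpm:
        raise ValueError("target rate must be at least the (positive) original rate")
    return 1.0 - original_wpm / target_wpm


# Illustrative figures only: a 175 wpm recording compressed by 50 percent.
resultant = compressed_rate(175.0, 0.5)
fraction = compression_needed(175.0, 350.0)
```

Reporting both figures, as item d suggests, lets a reader recover the original rate even when it is not stated separately.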

The  Test:  A  major  determinant  of  the  results  of  any  study  is  the  test  instrument. 
Ironically,  this  has  been  one  of  the  most  neglected  areas  in  Rate  Controlled  Speech 
research.  A  vast  majority  of  researchers  have  relied  solely  upon  "common  sense" 
or  "mother-wit"  in  the  construction  of  the  all  important  listening  test.  It  is  not 
the  intent  nor  perhaps  within  the  scope  of  this  paper  to  deal  with  the  many  rami- 
fications and  the  importance  of  the  test  instrument.  One  may  find  a  good  treatment 
of  this  subject,  however,  in  the  dissertation  of  Charles  M.  Rossiter,  Jr.,  done  in 
1970,  at  Ohio  University  under  the  direction  of  Carl  H.  Weaver.  Different  experiment 
objectives  will  certainly  dictate  different  kinds  of  test  instruments  as  well  as 
test-item  analyses.  Where  practical  or  appropriate  this  writer  would  suggest  the 
use  of  standardized  tests.  In  cases  where  the  standardized  test  is  not  used,  however, 
at least the following information should be provided:

a.  a  careful  analysis  of  the  specific  kind  of  behavior  required  for  correct 
response  to  each  test  item 

Much  more  attention  should  be  paid  to  this  problem.  For  example,  one 
should  be  particularly  careful  of  items  which  require  the  listener  to 
comprehend  or  identify  the  "main  idea."  Might  there  not  be  more  than  one 
main  idea?  Might  it  not  be  possible  that  one  could  "understand"  all  that 
was  said  without  agreeing  with  someone  else  as  to  which  idea  was  the  "main 
idea"?  When  a  listener  "disagrees"  with  the  tester  as  to  which  was  the 
main  idea,  what  effect  does  this  "disagreement"  have  on  final  test  results? 

The  above  represents  only  one  example  of  possible  test  item  problems. 
It  is  only  when  each  item  has  been  carefully  scrutinized  for  such  things 
as  type  of  learning  to  be  tested  and  necessary  response  behavior,  that  one 
can  logically  expect  reliable  individual  test  item  results. 

b. assessment of test-item content validity

c.  assessment  of  test-item  multiple  correlations 

d.  assessment  of  Kuder-Richardson-20  internal  consistency  measure 

e.  assessment  of  effect  of  prior  knowledge  on  test  items. 
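For item d, the Kuder-Richardson formula 20 can be computed directly from item-level results scored right or wrong. The sketch below is a minimal illustration under stated assumptions: each test item is scored 0 or 1, and the variance of total scores is taken as the population variance (some texts use the sample variance instead, which gives a slightly different value). The function name and sample data are hypothetical.

```python
def kr20(responses):
    """Kuder-Richardson formula 20 for dichotomously scored items.

    responses -- list of rows, one per examinee; each row is a list of
                 0/1 item scores, all rows the same length.
    """
    n = len(responses)
    k = len(responses[0])
    if n < 2 or k < 2:
        raise ValueError("need at least two examinees and two items")

    # Variance of the examinees' total scores (population form, an assumption).
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    if var_total == 0:
        raise ValueError("total scores have zero variance")

    # Sum over items of p * q, where p is the proportion answering correctly.
    pq_sum = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n
        pq_sum += p * (1.0 - p)

    return (k / (k - 1)) * (1.0 - pq_sum / var_total)


# Hypothetical results for four listeners on a three-item test.
sample = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
reliability = kr20(sample)
```

Reporting this figure alongside the raw item data called for in Appendix C would allow other investigators to recompute it.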

Equipment. The field of rate controlled speech technology is still relatively
new. Technological developments can be expected to emerge rather frequently, producing
improvements over previous rate changing machines. It is, therefore, imperative
that specific information be provided as to exact type of equipment used, and when
appropriate, condition of same. Additionally, all auxiliary sound equipment should
be indicated and briefly described.

Another often overlooked detail is a general description of the equipment and
facility lay-out. Such a description should indicate whether loud speakers or
earphones were used; kind of earphones (if used); whether students were divided by
booths, etc. It should be noted that while general descriptions are suggested for
this section, technical descriptions should be provided in the appropriate section
of the Appendix.

IV. PROCEDURE FOLLOWED

It  is  the  writer's  opinion  that  in  this  section  the  experimenter  should  assume 
that  others  will  want  to  replicate  his  study  for  any  number  of  reasons  and  that  in 
reporting,  it  should  be  his  goal  to  facilitate  and  even  encourage  this. 

Because  they  are  either  sufficiently  self-explanatory  or  do  not  appear  to  have 
caused  significant  difficulties,  the  remaining  headings  do  not  warrant  individual 
treatment here.

2. Development of a Visual Chart-Type Instrument on Which Appropriate Rate
Controlled  Speech  Variables  Might  be  Plotted.  A  particularly  useful  step  in  the 
development  of  a  theory  of  compressed  speech  would  be  the  production  of  a  visual 
chart  on  which  appropriate  rate  controlled  speech  research  variables  could  be  easily 
and  clearly  plotted.  A  prerequisite  to  plotting  is  the  adoption  of  a  common  conven- 
tion for  converting  reported  data  results  into  "scores."  In  a  field  where  there  is 
a  very  conspicuous  lack  of  standardization  of  measuring,  recording  and  reporting 
techniques,  this  is  in  itself  no  menial  task.  It  is,  however,  well  within  our 
capability.  This  time-consuming  but  not  particularly  complicated  labor  should 
yield  extremely  useful  information.  Additionally  the  resultant  chart  should  serve 
at  least  the  following  purposes: 

a.  Provide  a  means  of  identifying  areas  in  which  additional  research  would 
appear  necessary  and/or  desirable. 

b. Provide a means of identifying incongruent findings in research for
possible further investigation.

c. Provide a means of identifying possible across-the-board applicability
of certain findings.

d.  Provide  a  means  of  assisting  other  investigators  in  assessing  the  state 
of  the  art  in  terms  of  thorough  and  systematic  studies  on  rate-controlled 
listening. 
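One way the conversion to common "scores" and the first two purposes above might work in practice is sketched below. Everything in it is an assumption made for illustration: the paper does not specify a scoring convention, so percent of maximum possible score is used here, studies are grouped by listener population and by 50 wpm rate bands, and a 20-point spread is taken as the mark of an incongruent finding. The study records are invented.

```python
from collections import defaultdict


def percent_score(raw_mean, max_score):
    """Convert a reported mean to percent of the maximum possible score."""
    return 100.0 * raw_mean / max_score


def build_chart(studies, band_width=50):
    """Group converted scores into chart cells keyed by (population, wpm band)."""
    chart = defaultdict(list)
    for s in studies:
        band = (s["wpm"] // band_width) * band_width
        chart[(s["population"], band)].append(percent_score(s["mean"], s["max"]))
    return dict(chart)


def incongruent_cells(chart, spread=20.0):
    """Cells whose scores differ by more than `spread` points (purpose b)."""
    return [cell for cell, scores in chart.items()
            if max(scores) - min(scores) > spread]


# Invented study summaries, for illustration only.
studies = [
    {"population": "adult", "wpm": 275, "mean": 14, "max": 20},
    {"population": "adult", "wpm": 290, "mean": 8, "max": 20},
    {"population": "primary", "wpm": 210, "mean": 30, "max": 40},
]
chart = build_chart(studies)
conflicts = incongruent_cells(chart)
```

Cells that never appear in the chart correspond to purpose a (areas needing additional research), while the cells returned by `incongruent_cells` correspond to purpose b.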

An  interesting  side  note  is  the  fact  that  all  of  the  above  mentioned  items  will 
facilitate  easier  examination  of  all  reported  studies.  This  may  in  turn  suggest 
replication of several of these studies under more systematically controlled
conditions and with more standardized reporting conditions. It is the writer's opinion
that then and only then will a viable theory of rate-controlled speech be able to
emerge.


Charles  L.  Meadows 


[Figure: full-page chart grid following the Meadows paper, apparently a sample of the proposed visual chart for plotting rate-controlled speech variables. The graphic is not recoverable from the scan; legible labels include "Physically Handicapped" and "Primary / H-Sch / Adult".]


Comments on the Use of Rate-Controlled Recordings to
Improve Reading Skills
by Sticht, T. G.


Abstract 


A developmental model of the relationships between auding
and reading is presented to provide a conceptual base for the
use of rate controlled recordings to improve reading comprehension
or rate. Literature on the relationship between auding and reading
is reviewed, and data are presented for a task based on concepts
from the developmental model. The task is hypothesized to measure
the process of decoding print to internal language representations
during simultaneous auding and reading, and it also indexes the
student's ability to store information in a retrievable manner
(one index of comprehension) during the simultaneous auding and
reading task.




Comments  on  the  Use  of  Rate  Controlled  Recordings 
to  Improve  Reading  Skills 

Thomas  G.  Sticht 
Human  Resources  Research  Organization 

One of the aspects of rate controlled recordings which has intrigued
many educators is their potential for improving reading speed and
comprehension. Duker (1974) presents several reports of studies on the
effects on reading speed and comprehension of training by listening to
speech at various rates, or by simultaneously listening and reading at
various rates of speech. Despite the general ambiguity of research
findings - sometimes positive, sometimes negative, with no real
understanding why - commercial publishers have produced complete sets of
expensive instructional materials based upon listening to rate
accelerated speech to improve reading comprehension and speed (e.g.,
the Sack-Youman Speeded Tapes Lab).

Interestingly,  one  of  the  most  useful  applications  of  rate  controlled 
speech  has  been  to  permit  the  blind  to  listen  at  rates  comparable  to  typical 
silent  reading  rates  of  sighted  high  school  seniors  and  college  freshmen. 
Thus,  the  application  is  based  upon  getting  listening  rates  equivalent  to 
reading  rates. 

When using rate controlled recordings to improve reading rate, however,
the interest is in getting reading rates equivalent to listening rates.
That is, the rate of listening is used as a pacer or target rate to which it
is hoped reading rate will increase. Sometimes, it is hoped that listening
rate itself will increase, and that, almost magically (since usually no
mechanism or process is stated), reading rate will now rise to the new
level set by the improved listening rate.




Examination of the various studies which have attempted to use rate
controlled recordings to improve listening or reading skills reveals very
little by way of analysis of listening and reading processes to suggest a
strong basis for the research. For instance, few studies ask: In what
respects are listening and reading similar, so that training in processing
spoken messages at various rates could transfer to processing printed
messages? What characteristics do listening and reading share? What
characteristics differentiate the processes? How do these processes change
developmentally? How does learning to read differ from learning to listen?
And so forth.

Without a conceptual base which will suggest answers to questions
such as those above, applications of rate controlled recordings will
continue to be done on a completely hit and miss, empirical basis. This
will result in a proliferation of theses and dissertations - but little
knowledge - only an assortment of contradictory findings, unresolvable
and incomprehensible in the aggregate.

A  Developmental  Model  of  Auding  and  Reading 

As a beginning in providing a conceptual base for the rational study
of listening and reading, we are conducting research to explore some of the
basic perceptual, cognitive, and language factors involved in listening and
reading. In pursuing this work, we have found it useful to consider
relationships among listening and reading from a developmental perspective -
that is, with attention to the chronological development of listening and
reading skills, including the development of rates of listening and reading.



A detailed presentation of much of the thinking we have done in this area
is contained in Sticht et al. (1974). Here we present only a brief summary
of the developmental model; then we will discuss in greater detail
implications of the model for understanding factors underlying rate of
listening and reading. Finally, we will present some empirical research in
which we attempt to measure one aspect of reading which may be the target
for studies in which simultaneous listening and reading is used to improve
reading - namely, the decoding of print to language.

The  Developmental  Model 

Figure 1 presents the developmental model of literacy in schematic
form. Briefly, the model formally recognizes what common sense tells us:
a newborn child comes equipped with certain Basic Adaptive Processes
(BAP) for adapting to the surrounding world. These BAP include certain
information processing capacities for acquiring, storing, retrieving, and
manipulating information. This stored information processing capacity
forms a cognitive content which, in its earlier forms, is pre-linguistic
(Figure 1; Stage 1). After some time, though, the child develops skills
for receiving information representing the cognitive content of others,
and for representing his own cognitive content to others. This is
accomplished through the specialization of the information processing
activities of listening, looking, uttering, and marking (Figure 1;
Stage 2). The specialization is one of use of these skills for the express
purpose of externally representing one's own thoughts for others to
interpret, and forming internal representations of the external
representations of others' thoughts that they make. More specifically,
though, the particular specialization of present concern is the
representation of thoughts via the use of conventionalized signs (words)
and rules for sequencing these signs (syntax) in speaking and auding
(listening to speech in order to language) (Figure 1, Stage 3).

Figure 1     Overview of the Developmental Model of Literacy.

[Schematic not recoverable from the scan. Legible labels: Stage 1, Stage 2,
Stage 3, Stage 4; Environment; Basic Adaptive Processes: Sensory-Perceptual
(Hearing, Seeing, Motor Movement) and Cognitive (Processing and Storage of
Information); Precursors to Languaging: Receptive (Listening, Looking) and
Productive (Uttering, Marking); Development of Languaging: Oracy (Auding)
and Literacy (Reading, Writing); Cognitive Development (Content and
Processes): Pre-linguistic content, Linguistic sub-content.]

Finally,  if  the  child  is  in  a  literate  society,  he  may  acquire  the 
specialized  looking  and  marking  skills  of  reading  and  writing.  For  present 
purposes,  we  presume  that  we  are  talking  about  the  "typical"  case  in  our 
literate  society,  and  assert  that  children  typically  learn  to  read  and 
write  (Figure  1,  Stage  4). 

A further aspect of the developmental model is that it holds that
the development of the oracy skills requires the development of the
cognitive content through intellectual activity which we call conceptualizing
ability. In other words, the development of the oracy skills of speaking
and auding follows and is built upon a pre-linguistic cognitive content
and conceptualizing ability.

A final aspect of the model is that it asserts that the literacy skills
utilize the same conceptual base (cognitive content; conceptualizing ability;
knowledge) as is used in auding and speaking, and utilize the same signs and
rules for sequencing those signs as are used in the oral language skills for
receiving and expressing conceptualizations. Notice that this is an assertion
based upon the developmental sequence, i.e., the literacy skills are built
upon existing language and conceptualizing skills as the end of a
developmental sequence. This does not mean that, once literacy skills are
acquired, they do not contribute anything new to knowledge or language
capability; clearly they do. What is asserted is that when the literacy
skills are initially acquired, they are essentially to be construed as a
second way of utilizing the same language system the child uses in speaking
and auding.

Rates  of  Auding  and  Reading 

According to the developmental model outlined above, auding and reading
utilize the same languaging and conceptualizing systems. Developmentally,
one first develops auding skill, including the ability to process speech
information into language and conceptualizations at some adaptive rate (or
rates, depending upon the task at hand). We can speculate that this rate
will have some upper limit, and that this limit will reflect the limits of
languaging and conceptualizing rates. By languaging rate, we refer to the
speed with which conceptualizations can be encoded as language forms (e.g.,
meaningful morphemes, or "words" for present purposes) for speaking to
others, and the rate at which speech can be recoded from the acoustic form
into an internal language form (e.g., "words") when auding the speech of
others. By conceptualizing rate, we mean the rate at which concepts or
thoughts can be formed in order to be expressed in language by speaking,
and the rate at which language forms can be recoded into thoughts or
conceptualizations during auding.

If, in fact, there are limits on how quickly we can recode speech into
language, and language into conceptualizations during auding, and if, as
stated in the developmental model, reading utilizes the same languaging and
conceptualizing processes as used in auding, then we expect that maximal
auding and reading rates will be comparable, following the acquisition of
skill in recoding printed language into the internal representations of
language as are formed during the auding of speech. This hypothesis follows
from the fact that, in the present model, auding and reading utilize the
same languaging and conceptualizing systems. Hence, the limiting factors
underlying both auding and reading rate are skill in languaging and in
conceptualizing.

Evidence for the Comparability of Maximal Auding and Reading Rates:
While the concept of reading rate or "speed reading" is probably familiar,
readers of this report may not be familiar with the concept of auding rate
or "speed auding". Essentially, auding rate refers to how well one can
comprehend spoken passages presented at different rates of speech. For
instance, a paragraph might be read aloud to a listener at an average rate
of 150 wpm, and a comprehension test administered immediately. This
procedure is then repeated for comparable materials presented at rates of
200, 250, 300, and 350 wpm. Changes in immediate retention comprehension
scores are used to indicate the influence of speech rate on auding. Thus,
"speed auding" means auding speech presented at rapid rates.

[Footnote] Note that auding and reading are subsets of the more general
processes of listening and looking, respectively. Hence, confirmation of the
present hypothesis is evidence for the hypothesis that listening and looking
rates are equal, as they should be since they are simply modality names for
one internal process - focal attending. (See Sticht et al., 1974, pp. 43-50.)

In their 1969 review of research on rate of auding, Foulke and Sticht
concluded that, when various studies are considered collectively, the
relationship that emerges is one in which auding comprehension declines
slowly as word rate is increased, up to a rate of some 275 wpm; beyond this,
the decline in auding comprehension is faster. Subsequently, Foulke (1971)
reported data suggesting that auding comprehension declined more rapidly
when a rate of 250 wpm was exceeded. Carver (1973b) reported re-analyses of
Foulke's (1971) data which indicated that, for very difficult test items,
auding comprehension dropped off rapidly at 300 wpm, while for less
difficult items, auding comprehension declined only a little over the range
of speech rates from 125 to 400 wpm. In Figure 5 of the same article, Carver
presented data of his own indicating that subjects' judgment of how well
they understood spoken messages presented at various rates dropped off
gradually for speech rates from 100 to 300 wpm, and then declined rather
rapidly at rates beyond 300 wpm.

Carver  also  presented  evidence  (Figure  6  of  his  article)  to  suggest 
that  a  "threshold"  for  comprehending  auding  materials  might  be  surpassed  at 
speech  rates  as  low  as  150  wpm,  depending  upon  how  comprehension  is  measured 
(e.g.,  multiple-choice  tests,  judgments  of  understanding).  However,  in  a 
subsequent  unpublished  paper,  Carver  (1973c)  presented  additional  data  to 
suggest that, for college students, auding comprehension drops precipitously
when  rates  exceeding  300  wpm  are  presented.  Thus,  although  research  exists 
to  suggest  that  auding  comprehension  may  or  may  not  decline  at  rates  of 
speech  less  than  or  equal  to  250-300  wpm,  evidence  is  strong  for  suggesting 
that  rates  above  these  levels  will  almost  certainly  lead  to  rapid  losses  of 
information  by  auding. 

Regarding speed auding, then, current research indicates that, although
most information that is presented for auding does not demand processing rates
in excess of 150-200 wpm (newscasters and professional readers for the blind
typically read aloud at around 175 wpm ± 25 wpm; Foulke and Sticht, 1969;
Foulke, 1969), high school graduates and college students can aud at rates
up to 250-300 wpm before their capacities for rapidly processing language
information are overtaxed. If this represents some upper limit in rate of
languaging, then the present model predicts that once reading skill is
acquired, it will reflect this same limit in rate of languaging.

Data  bearing  on  normative  rates  of  silent  reading  are  available  from 
the  1972  National  Assessment  of  Educational  Progress  (Report  02-R-09). 
This  survey  measured  the  rate  at  which  respondents  aged  9,  13,  17,  and  26 
to 35 (young adults) silently read materials with the knowledge that they
would  be  tested  for  comprehension  (memory  for  details)  immediately  afterward. 

Data from the National Assessment report are summarized in Table 1.
While a clear growth in reading rate is evident from 9-year-olds to
17-year-olds, there is no evidence for silent reading rates in excess of the
250-300 wpm reported previously for upper ranges of auding rates. For
17-year-olds and young adults, only some 10% of the samples read in excess
of 300 wpm. Only 17 people out of the 7850 tested at all age levels read in
excess of 750 wpm - and these readers could not consistently answer four out
of five of the comprehension questions for two selections.

There is little evidence here, then, that people "typically" read at
rates far in excess of rates they can contend with by auding. In fact, the
median rates of silent reading for 17-year-olds and young adults are not too
much higher than the 175 wpm average oral reading rates of professional
newscasters and readers for the blind (cf. Foulke and Sticht, 1969). It is
also relevant to note that trained oral readers can produce speech rates as
fast as 220-344 wpm when asked to produce maximal, yet intelligible rates
of speech (Goldstein, 1940; Carroll, 1968; Miron and Brown, 1971). These
rates of reading aloud are fast enough to encompass the range of the silent
readers at the 75th percentile in Table 1. They are also within the range
of silent reading rates for college students, which are typically found to
be in the vicinity of 250-300 wpm (Gray, 1956; Carroll, 1968).


Table 1

Rate of Silent Reading for Four Age Groups(a)

                                  Grade Level        Reading Rate at Percentile(c)
Age (years)     N      Passage    of Materials(b)     25     50 (Median)     75

 9            2195        1        4-8                85        117         158
                          2        7-12               88        123         169
13            2196        1        5                 133        173         217
                          2        10-11             128        16?         212
17            2220        1        10                160        19?         247
                          2        College           157        195         246
26-35         1239        1        10                145        183         231
                          2        College           145        186         236

(a) Data are from National Assessment of Educational Progress Report 02-R-09:
Reading Rate and Comprehension, 1970-71 Assessment, December 1972.

(b) Grade levels are readability scores determined by 3 to 4 different
readability formulas. Data presented are ranges.

(c) Reading rates are words per minute (wpm).

[A "?" marks a digit that is illegible in the scan. The heading of the N
column did not survive; the four counts sum to the 7850 respondents cited
in the text.]

It appears that college students typically read silently at rates
comparable to those at which auding can be performed without serious
decrements in comprehension. In turn, both auding and reading rates of
college students seem to correspond to the upper rates at which oral reading
can be produced. This suggests a common factor underlying all three
processes, an idea we shall return to later in this section.

The evidence reviewed regarding the comparability of auding and reading
rates does not include direct comparisons of auding and reading. There are,
so far as we can determine, only a handful of studies that make such a direct
comparison. In an early study of the effects of rate of presentation of
messages on auding and reading comprehension, Goldstein (1940) presented
spoken messages to adults at 100, 137, 174, 211, 248, 285, and 322 wpm. He
found that comprehension scores, expressed in school grade equivalents,
decreased as 11.1, 10.8, 10.6, 10.5, 9.4, 9.3, and 8.7, respectively. Thus,
increasing the rate of presentation decreased the amount of information
available to be used in answering the comprehension questions. The largest
drop occurred between 211 and 248 wpm, with a decrease from 10.5 to 9.4 -
a 1.1 grade level drop.




In the same study, Goldstein also presented materials for reading at
different rates using a moving picture projection technique to control rate
of appearance of the printed text. For the same rates (100, 137, 174, 211,
248, 285, and 322 wpm) comprehension scores decreased as 10.6, 10.1, 10.1,
9.8, 9.4, 9.1, 8.7. It should be noted that the auding and reading
comprehension scores are quite similar, and that both auding and reading
scores decrease with increasing rates of presentation.
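
The Goldstein (1940) figures quoted above can be tabulated to check the two
observations made in the text: that the largest single drop in auding falls
between 211 and 248 wpm, and that auding and reading scores stay close at
every rate. What follows is only an illustrative sketch. The function and
variable names are hypothetical, and the numbers are the grade-equivalent
scores as quoted here, not Goldstein's raw data.

```python
# Grade-equivalent comprehension scores quoted in the text from Goldstein (1940).
# All names below are illustrative and do not come from the original study.
RATES_WPM = [100, 137, 174, 211, 248, 285, 322]
AUDING = [11.1, 10.8, 10.6, 10.5, 9.4, 9.3, 8.7]
READING = [10.6, 10.1, 10.1, 9.8, 9.4, 9.1, 8.7]


def largest_drop(rates, scores):
    """Return (drop, slower_rate, faster_rate) for the biggest step-to-step loss."""
    drops = [
        (round(scores[i] - scores[i + 1], 1), rates[i], rates[i + 1])
        for i in range(len(scores) - 1)
    ]
    return max(drops)


def modality_gaps(auding, reading):
    """Auding score minus reading score at each presentation rate."""
    return [round(a - r, 1) for a, r in zip(auding, reading)]


if __name__ == "__main__":
    print("largest auding drop:", largest_drop(RATES_WPM, AUDING))
    for rate, gap in zip(RATES_WPM, modality_gaps(AUDING, READING)):
        print(rate, "wpm: auding - reading =", gap)
```

On these figures the two modalities never differ by more than 0.7 of a grade
level, and the difference vanishes at 248 and 322 wpm.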

Jester and Travers (1966) presented passages for auding and reading at
rates of 150, 200, 250, 300 and 350 wpm. For auding, their college students
had mean retention comprehension raw scores of 14.7, 14.2, 7.3, 4.9, and 5.2
respectively. Corresponding reading scores were 15.5, 10.8, 9.1, 10.1, and
5.9. It is clear that at the fastest rate (350 wpm) auding and reading scores
are comparable, while at 300 wpm, reading is clearly superior to auding (10.1
to 4.9). On the other hand, auding surpassed reading at 200 wpm. At best
then, these data are inconclusive. It seems unlikely that reading would be
more effective than auding at 300 wpm, less effective at 200 wpm, and equally
effective at 150 wpm - especially since both Mowbray (1953) and, more
recently, Young (1973) found no differences in college students' auding and
reading retention comprehension scores when materials were presented at
175 wpm, with reading rates being paced by moving displays of print as in
Goldstein's study. Perhaps discrepancies between Goldstein's work and that
of Jester and Travers relate in some way to the fact that the latter
researchers used slide projection to present non-moving print displays.
Whatever the case, it is clear that at the fastest rate - 350 wpm - Jester
and Travers found auding and reading performance to be comparable. Thus,
there is no indication of great differences in rate of languaging favoring
reading.
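
The irregular pattern in the Jester and Travers (1966) scores is easier to
see when the reading-minus-auding difference is listed rate by rate. The
sketch below does only that, using the mean raw scores quoted above. The
names are hypothetical, and no claim is made about statistical significance,
since the source gives no variances.

```python
# Mean retention comprehension raw scores quoted in the text from
# Jester and Travers (1966). Names are illustrative only.
RATES_WPM = [150, 200, 250, 300, 350]
AUDING = [14.7, 14.2, 7.3, 4.9, 5.2]
READING = [15.5, 10.8, 9.1, 10.1, 5.9]


def reading_advantage(auding, reading):
    """Reading score minus auding score at each rate (positive favors reading)."""
    return [round(r - a, 1) for a, r in zip(auding, reading)]


if __name__ == "__main__":
    for rate, diff in zip(RATES_WPM, reading_advantage(AUDING, READING)):
        print(f"{rate} wpm: reading - auding = {diff:+.1f}")
```

The sign of the difference flips between 150 and 200 wpm and again between
200 and 250 wpm, which is the pattern the text calls inconclusive. At the
two extreme rates the difference is under one raw-score point.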



Carver (1973c) presents the most analytic discussion of the
relationship between auding and reading rates found by these reviewers.
He presented auding and reading passages to 108 college students at rates
ranging from 75 to 450 wpm. (Actually, reading rate was not directly
manipulated; rather, time for reading was limited to the duration needed to
present the passages for auding.) Comprehension was measured using
subjective judgments by subjects concerning the percent of thoughts contained
in the passages that they estimated they understood. This measure had
previously been demonstrated to be a valid, reliable method of measuring
comprehension (Carver, 1973a). Results of Carver's work indicated that,
consistent with the developmental model, both auding and reading rates were
optimal around 250-300 wpm.

[Footnote] Recently, Carver has completed similar research in which reading
rate was directly manipulated by use of moving picture projections of the
printed page. This work has confirmed the conclusion of his previous work
which is discussed herein.

To summarize briefly, research reviewed above indicates that:

(a) Typical oral reading rates for professional oral readers (newsmen;
readers for "talking books" for the blind) are around 175 wpm, with a
standard deviation of 25 wpm, hence auding rates of 175 wpm are typical for
persons auding such presentations.

(b) A national sample of 17-year-olds and young adults silently read at
rates of 185-195 wpm, suggesting that, typically, such persons do not read
silently much faster than they aud newscasts or radio programs.

(c) When requested to read aloud as rapidly as possible without loss of
intelligibility, trained oral readers can produce speech rates as high as
250-340 wpm.


(d) When adults are presented spoken materials for rapid auding,
comprehension typically holds up well for speech rates up to 250-300 wpm,
then declines more rapidly.

(e)  A  national  sample  of  17-year-olds  and  adults  showed  less  than  10%  of 
the  population  reading  above  300  wpm,  with  the  75th  percentile  reading  at 
231-247  wpm;  additional  studies  indicate  that  high  school  students  and 
college  students  -  that  is,  the  better  readers  in  the  country  -  typically 
read  at  rates  of  250-300  wpm. 

(f)  Studies  which  have  directly  compared  the  effectiveness  of  auding  and 
reading,  at  different  rates  of  presentation  of  the  material  up  to  350  wpm, 
show  comparable  levels  of  comprehension  for  the  two  processes  at  the  fastest 
rates. 

From the foregoing we conclude that, to date, there is no clearly
demonstrated superiority for the reading process in rate of processing
language information from print over what can be accomplished by the auding
process in processing language information from speech. Rather, the
available data suggest that both auding and reading processes may operate at
the same rates of efficiency when the rate of presentation of language
material is directly manipulated. This conclusion is consistent with the
assertion in the developmental model that reading utilizes the same
languaging capabilities as auding. Hence, the rate at which languaging can be
executed limits both the rate of auding and subsequently the rate of reading
when that skill is acquired.




Speculation  On  The  Rate  of  Languaging:  It  is  of  interest  to  note  that 
the  rates  of  250-300  wpm,  indicated  by  the  foregoing  as  more-or-less  "maximal" 
rates  for  auding  and  silent  reading,  correspond  closely  to  the  fastest  rates 
at  which  trained  readers  can  read  aloud.  This  suggests  that  the  same  factors 
which  limit  rates  of  reading  aloud  may  limit  rates  of  auding  and  reading. 
One  factor  limiting  oral  reading  is  the  rate  at  which  articulatory  movements 
can  be  made.  Lenneberg  (1967,  pp  88-124)  discusses  various  aspects  of  speech 
production,  including  the  rate  at  which  articulatory  movements  (syllables) 
can  be  made.  He  reports  that  "...  subjects  between  the  ages  of  eight  to 
about  thirty  could  speed  up  production  to  eight  and  occasionally  even  nine 
syllables  per  second  for  the  duration  of  a  few  seconds;  the  rate  slowed  down 
to  about  six  per  second  if  the  alternating  movements  were  to  be  sustained 
over  more  than  three  or  four  seconds."  (p.  115) 

Taking six syllables per second as an efficient level of production
gives 360 syllables per minute. Then, assuming 1.42 syllables per word
(the average for 33 of the 36 passages scaled for complexity by Miller and
Coleman, 1967; Carroll, 1967, describes six passages with an average of 1.44
syllables per word), we obtain a rate of 254 wpm - a rate comparable to the
average silent reading rate of high school students (Carroll, 1968). A rate
of 300 wpm corresponds to a rate of 7.1 syllables per second, which lies
between Lenneberg's rates of six syllables per second for sustained
production and nine syllables per second for brief durations of production.
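
The conversion between syllable rate and word rate used in this paragraph is
simple enough to check directly. The sketch below assumes, as the text does,
a fixed average of 1.42 syllables per word. The function names are
hypothetical and the fixed ratio is a simplification, since syllables per
word vary with the difficulty of the passage.

```python
# Average syllables per word quoted in the text (Miller and Coleman, 1967).
SYLLABLES_PER_WORD = 1.42


def syllables_per_second_to_wpm(syl_per_sec, syllables_per_word=SYLLABLES_PER_WORD):
    """Convert an articulation rate in syllables/second to words/minute."""
    return syl_per_sec * 60.0 / syllables_per_word


def wpm_to_syllables_per_second(wpm, syllables_per_word=SYLLABLES_PER_WORD):
    """Convert a word rate in words/minute to syllables/second."""
    return wpm * syllables_per_word / 60.0


if __name__ == "__main__":
    # Sustained production (six syllables per second).
    print(round(syllables_per_second_to_wpm(6)), "wpm")
    # Brief bursts (nine syllables per second).
    print(round(syllables_per_second_to_wpm(9)), "wpm")
    # The 300 wpm upper bound expressed as a syllable rate.
    print(round(wpm_to_syllables_per_second(300), 1), "syllables per second")
```

With these assumptions six syllables per second comes to about 254 wpm and
nine syllables per second to about 380 wpm, so the 250-300 wpm range
discussed in the text sits at or just above the sustained articulation rate.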

There appears, then, to be a close relationship between the rate at
which syllables can be produced, and maximal auding and silent reading rates.
It is as though, typically, auders and readers utilize the same mechanisms
for decoding spoken or printed language into conceptualizations as are used
in signaling conceptualizations to others via speech.

This is, of course, an old idea. Huey (1908; reprinted in 1968) devotes
two chapters to the role of "inner-speech" in reading. He states: "The
simple fact is that the inner saying or hearing of what is read seems to be
the core of ordinary reading, the 'thing in itself,' so far as there is such
a part of such a complex process." (p. 122) While elsewhere Huey states that
the fact of inner speech forming a part of silent reading has not been
disputed (p. 117), Kolers, in his introduction to the 1968 printing of Huey's
book, expresses the kind of ideas that have obscured the relationship between
languaging and auding and reading when he states that: "People who read
faster than about three or four hundred words per minute, and certainly those
who read at rates of a few thousand words per minute, simply have not enough
time to form an auditory representation of all they read." (p. xxvii)

Of course, Kolers gives no data to indicate that people can read a "few
thousand words per minute". In fact, Taylor (1962) presents eye movement
records which clearly indicate qualitative differences between "normal"
reading and "reading" at 3000 or more wpm. The latter recordings indicate
that the "rapid reading" eyes move in a completely different manner than do
the "normal reading" eyes. The latter move systematically to the right
across a line of print making three or four stops (fixations), and then make
a return sweep to the left margin and begin to move to the right again. The
"rapid reading" eyes, on the other hand, may move down the left margin for
10 lines or so, then back to the left, and so on, quite clearly doing
something other than "normal reading".



Thus, while "skimming" or "scanning" can most certainly be accomplished
with printed displays, there is little evidence that readers can,
or typically do, read at rates far above the rates at which they can aud
or speak (see Edfeldt, 1960; Sokolov, 1972, pp. 202-211, for further
discussion and research on inner speech and reading; Carver, 1971a, for
discussion of "speed reading").

The upshot of this analysis is that much of silent reading appears
to involve the conversion of printed symbols into the same type of signing
systems used in receiving and expressing oral symbols, which are then
converted into, or directly give rise to, conceptualizations. Thus, the
representation of meaning directly by written language does not appear to be
a typical happening, as some have argued is the case with skilled readers
(Goodman, 1973; Smith, 1971, pp. 44-45 - again, we see here the claim that
". . . trained readers can cover [but not read one by one] many thousands
of words in a minute" with no evidence given, and with a failure to
carefully distinguish reading from skimming or scanning).

The  fact  that  the  maximal  rates  of  syllable  production  closely  match 
the  optimal  auding  and  reading  rates  should  not  be  taken  to  necessarily 
imply  the  syllable  as  the  "basic"  unit  of  language.  It  may  be,  but  there 
are  many  problems  in  adequately  defining  syllables  (Shuy,  1969)  both  as 
units  of  speech  and  as  units  of  print.  For  present  purposes,  it  is  suffi- 
cient to  note  the  similarities  among  rate  of  syllable  production  (movement 
of  articulators),  rapid  auding,  and  rapid  reading,  and  to  point  out  the 
relevance  of  this  observation  to  the  developmental  model. 


Speculation  on  the  Rate  of  Conceptualizing:  Lenneberg  (1967,  p.  90) 
points  out  that,  while  most  adults  are  capable  of  producing  common  phrases 
or  cliches  at  rates  up  to  500  syllables  per  minute,  more  frequently  they 
speak  at  210  or  220  syllables  per  minute  (150  wpm).  He  then  states:  "Appar- 
ently, the  most  important  factor  limiting  the  rate  of  speech  involves  the 
cognitive  aspects  of  language  and  not  the  physical  ability  to  perform  the 
articulatory  movements.  We  may  not  be  able  to  organize  our  thoughts  fast 
enough  to  allow  us  to  speak  at  the  fastest  possible  rate." 
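
The arithmetic behind these figures can be sketched as follows. This is only an illustration: the three rates are the ones quoted from Lenneberg above, while the function names and the assumption that cliches have the same syllables-per-word ratio as ordinary speech are ours, not the source's.

```python
def syllables_per_word(syllables_per_minute: float, words_per_minute: float) -> float:
    """Average syllables per word implied by a syllable rate and a word rate."""
    return syllables_per_minute / words_per_minute


def words_per_minute(syllables_per_minute: float, syl_per_word: float) -> float:
    """Convert a syllable rate to a word rate for a given syllables-per-word ratio."""
    return syllables_per_minute / syl_per_word


# Rates quoted from Lenneberg in the paragraph above.
TYPICAL_SPM = (210, 220)   # syllables per minute in ordinary speech
TYPICAL_WPM = 150          # word rate given for the same speech
MAX_SPM = 500              # syllables per minute for common phrases or cliches

# Syllables per word implied by the ordinary-speech figures.
ratios = [syllables_per_word(spm, TYPICAL_WPM) for spm in TYPICAL_SPM]

# Assumption (not in the source): cliches have the same syllables-per-word ratio.
ceiling_wpm = [words_per_minute(MAX_SPM, r) for r in ratios]

print([round(r, 2) for r in ratios])    # [1.4, 1.47]
print([round(w) for w in ceiling_wpm])  # [357, 341]
```

Under that assumption the maximal articulation rate corresponds to roughly 340 to 360 wpm, which is the order of magnitude discussed in this section.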

It  is  likewise  possible  that  in  auding  and  reading  we  may  not  be  able 
to  merge  the  thoughts  being  presented  with  our  own  conceptual  base  fast 
enough  to  "track"  the  oral  or  printed  message.  Possibly,  it  is  primarily 
lack  of  conceptualizing  time  which  causes  the  gradual  loss  in  comprehension 
when auding and reading speeds are increased up to 250-300 wpm. Beyond 300
wpm  then,  the  loss  in  comprehension  may  reflect  both  lack  of  conceptualizing 
time  and  inability  to  mobilize  inner  articulatory  patterns  rapidly  enough  to 
faithfully  follow  the  message. 

Evidence  that  ability  to  rapidly  conceptualize  is  related  to  ability 
to  comprehend  rapid  rates  of  speech  is  available  in  a  study  by  Friedman  and 
Johnson  (1969).  They  administered  a  group  of  cognitive  tests  to  college 
students  who  also  auded  materials  presented  at  175,  250,  325,  or  450  wpm. 
One  of  the  cognitive  tests  -  the  Best  Trend  Name  Test  -  requires  students  to 
infer  the  semantic  relationships  among  a  set  of  words.  For  example,  the 
words  "horse-pushcart-bicycle-car"  are  presented  and  the  student  is  asked  to 
decide  whether  the  relationship  among  the  four  terms  is  best  described  as 
one of "speed", "time", or "size". The correct answer is "time" since the
sequence  describes  an  order  of  historical  development;  horses  were  the 
earliest  means  of  transportation,  cars  the  most  recent. 

Results  of  multiple  regression  analyses  for  predicting  auding  ability 
at  each  of  the  four  rates  listed  indicated  that  while  the  Best  Trend  Name 
Test  was  a  poor  predictor  of  performance  at  the  slowest  rates,  its  corre- 
lation and  beta  weight  increased  significantly  with  the  fastest  rate  of 
speech,  identifying  it  as  a  major  source  of  individual  variance  in  the  com- 
prehension of  highly  accelerated  speech.  Thus,  the  ability  to  efficiently 
conceptualize  semantic  relations  among  vocabulary  items  facilitates  compre- 
hension of  more  rapid  rates  of  speech. 

The  role  of  conceptualizing  ability  in  comprehending  auding  materials 
is  also  demonstrated  by  the  fact  that,  even  at  rates  of  speech  of  from  125 
to  175  wpm,  high  aptitude  men  do  not  learn  as  much  from  materials  written 
at  grade  level  14.5  or  8.5  as  they  do  from  materials  of  grade  5.5  difficulty 
(Sticht,  1972).  Thus  the  effects  of  difficulty  level  of  material  appear  to 
represent  conceptualizing  rather  than  languaging  (encoding  and  decoding  con- 
ceptualizations into  and  out  of  forms  for  communication)  difficulties  at 
normal rates of presentation, although research does not rule out the
possibility that higher grade-level materials may be more difficult to encode and
decode  for  some  individuals. 

The  role  of  conceptualization  ability,  or  ability  to  "organize  our 
thoughts",  in  comprehending  auding  messages  presented  at  various  rates  is 
also  evidenced  by  the  differences  in  performance  between  "high"  and  "low" 
aptitude students. Sticht (1972) found that men of low verbal aptitude did
not  learn  as  much  auding  fifth-grade  materials  presented  at  150  wpm  as 
high  verbal  ability  men  did  at  350  wpm.  In  another  study  (Sticht,  1968) 
it  was  found  that  low  verbal  ability  men  learned  passages  of  6th,  7th,  and 
14th  grade  level  of  difficulty  as  well  by  auding  as  they  did  by  reading 
when  materials  were  presented  at  175  wpm,  but  in  neither  case  did  they  do 
as  well  as  higher  verbal  aptitude  men.  Thus,  "low  aptitude"  or  "low  verbal" 
intelligence seems more likely to represent conceptualization problems than
problems  associated  with  rapid  encoding  or  decoding  of  concepts  into  lan- 
guage to  send  or  receive  ideas. 

The  point  we  are  making  is  that  performance  on  immediate  tests  of  re- 
tention of  information  typically  used  to  evaluate  auding  and  reading  ability 
at  various  rates  of  presentation  reflects  a  combination  of  the  ability  to 
encode  and  decode  information  from  the  conceptual  base  into  or  out  of  spoken 
or  printed  representations  of  our  concepts,  and  the  ability  to  formulate  and 
reformulate concepts in keeping with the message being sent (speaking) or
received  (auding  or  reading).  Other  things  being  equal,  the  former  ability 
will  interfere  with  performance  when  rates  of  information  display  exceed 
300  or  so  words  per  minute,  while  the  latter  ability  will  hinder  or  facili- 
tate performance  over  all  ranges  of  rates  of  presentation,  and  can  be  demon- 
strated by  manipulating  the  difficulty  levels  of  materials  and  the  "mental 
aptitude"  of  the  students.  We  are  inclined  at  the  moment  to  call  the  former 
a  languaging  problem,  and  the  latter  a  conceptualizing  problem. 


Measuring  Aspects  of  Languaging  and  Conceptualizing 
During  Simultaneous  Auding  and  Reading 

We  opened  this  discussion  of  auding  and  reading  by  indicating  that 
there  has  been  interest  in  improving  reading  skill  by  having  students  read 
passages  while  they  simultaneously  aud  the  passage.  By  increasing  the  rate 
of  presentation  of  the  auding  passage  -  generally  through  the  use  of  time 
compression  devices  -  the  attempt  is  made  to  increase  the  rate  of  reading. 

As  we  have  indicated  above,  the  auding  and  reading  processes  contain 
both  languaging  and  conceptualizing  components.  It  is  of  interest  to  know 
which  aspects  of  the  reading  process  might  be  affected  by  practice  in  simul- 
taneous auding  and  reading.  For  example,  one  aspect  of  languaging  by  read- 
ing is  the  decoding  of  printed  words  into  the  language  forms  used  in  the 
spoken  language.  This  type  of  decoding  training  is  found  in  phonics  pro- 
grams in  which  grapheme-phoneme  correspondences  are  taught.  This  type  of 
decoding  skill  apparently  becomes  overlearned  in  skilled  readers,  until  it 
is  completely  unconscious,  or  automatic.  Of  course,  the  skill  can  be  used 
consciously  whenever  a  difficult  word  is  encountered  -  e.g.,  sphygmomanometer 
-  and  we  revert  to  "sounding  it  out". 

One  outcome  of  simultaneous  auding  and  reading  training  then,  might  be 
to  help  "automatize"  the  decoding  component  of  reading.  The  use  of  faster 
and  faster  speech  rates  might  provide  practice  in  more  rapid  decoding  and 
facilitate  the  automatization  process. 


We could also argue that a person who has great skill in decoding
print  to  internal  language  forms  should  be  able  to  perform  such  a  task 
at  faster  rates  than  a  relatively  unskilled  person.  Thus,  if  we  presented 
simultaneous  auding  and  reading  passages  at  faster  and  faster  speech  rates, 
we  might  conclude  that  the  person  who  can  store  the  greater  amount  of  infor- 
mation and  use  it  later  on  to  answer  questions  is  the  more  skillful  decoder 
of print to internal language. But, of course, this would be an improper
conclusion,  because  the  person  might,  in  fact,  ignore  the  printed  message 
and  simply  aud  the  spoken  message.  Because  all  of  the  information  is  in  the 
auding  message,  a  person  might  be  quite  unskilled  at  reading/decoding,  but 
quite  skilled  at  auding  and  do  well  on  retention  tests  of  comprehension. 

In  another  case,  however,  the  person  low  in  reading/decoding  skills 
who  is  also  low  in  oral  language  and  conceptualizing  skills,  would  do  poorly 
on  a  retention  test  even  if  he  did  ignore  the  print  and  attend  only  to  the 
spoken  message.  But,  if  we  have  only  the  immediate  retention  test  data,  we 
cannot  tell  if  poor  performance  reflects  poor  reading/decoding,  poor  oral 
language/conceptualizing  skills,  or  both. 

In  order  to  better  understand  the  effects  of  training  in  simultaneous 
auding  and  reading  on  the  improvement  of  reading  skill,  we  are  exploring 
techniques  for  assessing  the  reading-decoding  and  language/conceptualizing 
components  during  simultaneous  auding  and  reading.  One  technique  we  are 
exploring  is  described  below. 


Detection  of  Spoken  and  Printed  Word  Mismatches  During  a  Simultaneous 
Auding  and  Reading  Task:  Our  interest  in  this  task  is  to  obtain  an  indica- 
tion of  a  person's  skill  in  performing  the  reading-decoding  process  during 
simultaneous  auding  and  reading.  The  procedure  we  are  exploring  has  been 
tried  out  with  children  at  the  fifth  grade  level*  and  with  adults  of  low 
and  high  reading  skills.  Here,  I  will  describe  procedures  and  data  obtained 
with adults which demonstrate that the technique does seem to provide a
measure of reading-decoding skill (or automaticity of decoding, as it will be
referred  to  below). 

The  subjects  of  this  research  were  four  groups  of  adults;  two  groups 
having  high  reading  ability  (HRA)  and  two  groups  having  low  reading  ability 
(LRA).  The  HRA  adults  were  college  students  or  out-of-school  young  men. 
Reading  scores  were  11th  grade  level  or  higher.  The  LRA  adults  were  young 
men  in  a  military  literacy  program  with  reading  grade  levels  below  the  6.0 
grade  level. 

To  assess  the  automaticity  of  decoding  during  simultaneous  auding  and 
reading,  one  group  of  HRA  and  LRA  adults  were  presented  a  2800-word  selection 
from  a  fifth  grade  version  of  Roland  and  Charlemagne  to  be  simultaneously 
auded  and  read.  Then  we  arranged  that  at  times  during  the  presentation,  there 
would  occur  a  different,  though  semantically  appropriate,  word  in  the  spoken 
message  than  that  which  appeared  on  the  printed  page.  For  instance,  the 
printed  story  might  state  "With  the  air  of  a  lord  he  walked  .  .  .  ",  while 
the  spoken  story  would  state  "With  the  air  of  a  prince   he  walked  .  .  .  ". 
When  subjects  encountered  a  mismatch,  they  circled  the  printed  word  which  did 
not match the spoken word. Following this procedure then, in order to perform
the  mismatch  detection  task,  the  subjects  had  to  continually  decode  the 
print  into  a  form  comparable  to  the  spoken  word,  and  perform  an  internal 
comparison.  To  determine  different  levels  of  skill  in  tracking  the  message 
and  performing  this  mismatch  detection  task,  the  audio  tapes  were  time- 
compressed  to  produce  speech  rates  of  228  and  328  words  per  minute,  while 
the  uncompressed  rate  was  128  wpm. 
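
The scoring of the mismatch detection task might be sketched as below. This is a minimal illustration under stated assumptions: the printed and spoken texts are taken to be aligned word for word, only hits are counted (the text does not say how circling a matching word was scored), and all function names are hypothetical.

```python
def mismatch_positions(printed, spoken):
    """Indices at which the printed word differs from the spoken word.

    Assumes the two word lists are aligned one to one (same length).
    """
    return {i for i, (p, s) in enumerate(zip(printed, spoken))
            if p.lower() != s.lower()}


def percent_correct(circled, printed, spoken):
    """Percent of true mismatches that the subject circled.

    Assumption: false alarms (circling a matching word) are not penalized.
    """
    targets = mismatch_positions(printed, spoken)
    if not targets:
        raise ValueError("the passage contains no mismatches to detect")
    hits = len(targets & set(circled))
    return 100.0 * hits / len(targets)


# The example mismatch given in the text.
printed = "With the air of a lord he walked".split()
spoken = "With the air of a prince he walked".split()

print(mismatch_positions(printed, spoken))    # {5}
print(percent_correct({5}, printed, spoken))  # 100.0
```

A subject who circled nothing would score 0.0 on this fragment; a fuller scoring rule would also need a policy for false alarms.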

To  gain  additional  evidence  that  the  "tracking"  task  described  above 
(detecting  mismatches  between  aural  and  visual  words)  does  indeed  involve 
continuous  decoding,  a  second  set  of  HRA  and  LRA  adult  groups  were  presented 
a  second  version  of  the  same  material.  But,  in  this  case,  the  mismatch  word 
was  replaced  on  the  printed  page  by  three  words  (see  example),  one  of  which 
matched  the  word  in  the  spoken  message.  In  this  case,  the  subjects'  task 
was  circling  the  matching  words. 

                            prince
Example:  With the air of a king    he walked . . .
                            lord

With  such  an  arrangement,  the  subject  is  able  to  skip  a  lot  of  the 
decoding  required  in  the  former  task,  because  he  has  a  cue  as  to  where  his 
next  decision  must  be  made.  We  refer  to  this  version  of  the  tracking  task 
as  the  "cued"  version,  while  the  first  version  is  called  the  "uncued"  tracking 
task.  The  "cued"  version  is  also  referred  to  as  a  low  decoding  demand  task, 
while  the  "uncued"  tracking  task  is  a  high  decoding  demand  task. 

In  both  the  high  and  low  decoding  tasks,  the  first  third  of  the  story 
was  presented  at  128  wpm,  the  second  third  at  228,  and  the  final  third  at 
328  wpm.  After  each  third  of  the  selection,  15  four-alternative  multiple 
choice questions were administered to the subjects. All questions called for
retention  of  detail  -  no  inference  or  reasoning  items  were  included. 
These  tests  thus  provided  immediate  retention  indicators  of  comprehension. 
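
As a rough guide to what these rates mean in listening time, the presentation schedule can be worked out as follows. The sketch assumes the 2800-word selection was split into three sections of equal word count, which the text implies but does not state exactly; the names are illustrative only.

```python
TOTAL_WORDS = 2800            # length of the selection, from the text
RATES_WPM = (128, 228, 328)   # uncompressed rate first, then the compressed rates


def minutes_to_present(words: float, wpm: float) -> float:
    """Listening time in minutes for a given number of words at a given rate."""
    return words / wpm


def fraction_of_original_time(uncompressed_wpm: float, target_wpm: float) -> float:
    """Share of the original playing time that remains after compression."""
    return uncompressed_wpm / target_wpm


def schedule(total_words=TOTAL_WORDS, rates=RATES_WPM):
    """(rate, minutes per section, fraction of original time) for each section.

    Assumption: the sections contain equal numbers of words.
    """
    words_per_section = total_words / len(rates)
    return [
        (rate,
         round(minutes_to_present(words_per_section, rate), 2),
         round(fraction_of_original_time(rates[0], rate), 3))
        for rate in rates
    ]


for rate, minutes, fraction in schedule():
    print(f"{rate} wpm: {minutes} min per section, {fraction} of original time")
```

On these assumptions each section takes about 7.3, 4.1, and 2.8 minutes respectively, so the fastest section is heard in roughly 39% of its uncompressed playing time.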

Figure  2  presents  the  results  of  the  studies.  Part  A  presents  the 
"tracking"  task,  in  which  one  of  the  three  alternatives  (low  decoding)  or 
the  printed/spoken  word  mismatch  (high  decoding)  was  circled  during  the 
presentation  of  the  message  for  simultaneous  auding  and  reading.  Part  B 
presents  the  immediate  retention  data. 

Of  major  interest  is  the  difference  between  the  curves  for  the  low  and 
high  decoding  tasks  in  the  tracking  data  (Part  A).  At  the  128  wpm  rate,  in 
the  low  decoding  task,  both  low  and  high  reading  ability  people  performed 
practically  100%  correct.  Under  the  high  decoding  conditions,  however,  the 
low  reading  ability  people  scored  only  60%  correct,  while  the  high  reading 
ability  people  maintained  almost  perfect  performance.  With  the  faster 
speech  rates,  under  the  low  decoding  condition,  the  high  reading  ability 
people  maintained  almost  perfect  "tracking"  performance,  while  a  systematic 
decrease  is  observed  for  the  low  reading  ability  people.  Also,  within  each 
speech  rate,  there  is  a  systematic  difference  between  the  low  and  high  de- 
coding conditions,  with  the  latter  always  lower  than  the  former.  At  the 
faster  speech  rates,  even  the  high  reading  ability  people  show  a  drop  in 
their  tracking  performance  under  the  high  decoding  condition. 

We interpret the tracking (Part A) data of Figure 2 as indicative of
a person's skill in performing the reading-decoding process during
simultaneous auding and reading. Skill is indexed in two ways: being able to
maintain a high level of performance across low and high decoding tasks,
and being able to maintain a high level of performance across all speech
rates. The reading-decoding task is most difficult when each word in the
printed page must be compared to each word in the spoken message (high
decoding task) when the latter is rapidly presented at 328 words per minute.
Under this condition, even highly skilled readers show a large (40%)
decrement in performance.

[Figure 2 graph not reproduced. Vertical axis: Percent Correct.]

Figure 2. Part A, Tracking, presents mean percent correct scores for the
detection of mismatches between spoken and printed messages
for cued (low decoding) and uncued (high decoding) conditions
at three speech rates for high and low reading ability adults.
Part B presents mean percent correct scores for immediate
Retention tests for the same conditions and subjects.
(See text, page 24 for full explanation of conditions.)

The  immediate  retention  data  (Part  B  of  Figure  2)  indicate  that  the 
high  reading  ability  people  had  no  trouble  in  performing  the  tracking  task 
and  storing  sufficient  information  from  the  message  to  be  able  to  respond 
better  than  80%  correct  across  all  three  speech  rates  and  under  both  decod- 
ing levels.  Apparently  this  fifth  grade  material  is  well  within  the  lan- 
guaging  and  conceptualizing  capabilities  of  these  highly  skilled  readers. 

For  the  poorer  readers,  however,  increasing  the  speech  rate  produced 
a  systematic  decline  in  the  amount  of  information  which  was  stored  in  a 
retrievable  manner  during  the  simultaneous  auding  and  reading  task,  though 
the effects of decoding level were inconsistent for some unknown and
uninterpretable (by us) reason. The fact that, at the 128 wpm rate, the low reading
ability people performed at a fairly high level (almost as high as the
high ability people - 70% compared to 88%) suggests that the message was
well  within  their  languaging  and  conceptualizing  knowledge,  but  the  30%  or 
so  decrease  in  performance  when  the  speech  rate  was  increased  to  228  and 
then  328  wpm  suggests  a  lack  of  skill  in  processing  the  language  informa- 
tion and/or  forming  conceptualizations  from  that  information  in  such  a 
way as to store much of it in a retrievable manner. This happened even
though  their  tracking  scores  dropped  to  such  low  levels,  particularly 
under  the  high  decoding  task,  as  to  suggest  that  they  may  have  ignored 
much  of  the  reading  task  and  instead  attended  to  the  auding  message.  This 
would have permitted them to process at least some of the message for
subsequent retrieval.

Summary 

One  of  the  applications  of  rate  compressed  speech  which  has  been  explored 
is  the  use  of  simultaneous  auding  (listening)  and  reading  to  improve  reading 
speed and/or comprehension. It was pointed out that most of this work is
contradictory in its findings, reflecting, at least partly, we felt, a lack of
conceptualizing regarding the auding and reading processes and their relations.

For  this  reason,  we  have  been  examining  simultaneous  auding  and  reading 
tasks  within  the  context  of  a  developmental  model  of  reading.  Among  other 
things,  the  model  leads  to  the  hypothesis  that  maximal  auding  and  reading 
rates  will  probably  be  the  same,  once  fully  developed  reading-decoding  skills 
are  acquired.  This  is  so  because  reading  utilizes  the  same  language  base 
(lexicon;  syntax)  and  the  same  conceptual  base  (semantic  memory)  as  used  in 
auding.  Hence,  rates  at  which  conceptions  can  be  formed  from  language,  and 
rates  at  which  language  can  be  formed,  ought  to  set  upper  limits  to  both  aud- 
ing and  reading. 

We  speculated  that  simultaneous  auding  and  reading  training  might  serve 
to improve the rate of formation of language from printed displays (decoding
of  print  to  internal  "spoken"  language)  and/or  it  might  improve  the  rate 
of  transforming  language  into  conceptualizations.  While  it  is  difficult 
to  separate  these  processes  operationally,  we  present  data  herein  (Figure  2) 
obtained  using  a  procedure  which  we  think  provides  a  measure  of  reading- 
decoding  skill  during  the  simultaneous  auding  and  reading  process.  It  is 
our  hope  that  this  research  will  lead  to  conceptually  and  procedurally 
sound  evaluation  methods  for  assessing  the  skills  which  might  be  affected 
in  a  simultaneous  auding  and  reading  task. 


FOOTNOTE 

This  research  was  performed  at  HumRRO's  Western  Division, 
P.O.  Box  5787,  Presidio  of  Monterey,  California.  Portions 
of  the  research  were  supported  by  Air  Force  Human  Resources 
Laboratory/Technical  Training  Contracts  F41609-75-C0014  and 
F41609-73-C0025,  with  Dr.  James  R.  Burkett  as  the  technical 
monitor.  I  am  appreciative  of  comments  on  the  paper  pro- 
vided by  Larry  Beck  of  HumRRO's  Western  Division,  and  of 
the  skillful  work  on  the  manuscript  by  Maurlaine  Jorgenson. 
This  paper  is  essentially  an  integration  of  work  previously 
presented  in  Sticht,  et  al  (1974)  and  Sticht  (1975). 


REFERENCES 

Carroll, J.B. "Problems of Measuring Speech Rate", in E. Foulke (ed.)
Proceedings of the Louisville Conference on Time Compressed Speech,
University of Louisville, Louisville, KY, 1967.

Carroll,  J.B.  Development  of  Native  Language  Skills  Beyond  the  Early  Years 
(Research  Bulletin)  Educational  Testing  Service,  Princeton,  NJ,  1968 

Carver,  R.P.  "Understanding,  Information  Processing,  and  Learning  From  Prose 
Materials",  Journal  of  Educational  Psychology,    vol.  64,  1973a,  pp  76-84. 

Carver,  R.P.  "Effect  of  Increasing  the  Rate  of  Speech  Presentation  Upon 
Comprehension", Journal of Educational Psychology, vol. 65, 1973b,
pp  118-126. 

Carver, R.P. Optimal Information Storage Rate for Reading Prose, Unpublished
Manuscript,  American  Institutes  for  Research,  1973c. 

Duker,  S.  Time-Compressed  Speech:  An  Anthology  and  Bibliography  in  Three 
Volumes,   Scarecrow  Press,  Metuchen,  NJ,  1974. 

Edfeldt,  A.W.  Silent  Speech  and  Silent  Reading,   University  of  Chicago  Press, 
Chicago,  1960. 

Foulke,  E.  The  Comprehension  of  Rapid  Speech  by  the  Blind:     Part  III,   Final 
Progress  Report  on  Cooperative  Research  Project  24-30,   University  of 
Louisville,  Louisville,  KY,  September,  1969. 

Foulke, E. "The Perception of Time Compressed Speech", in D. Horton and
J. Jenkins (eds.) The Perception of Language, Charles E. Merrill,
Columbus, OH, 1971.

Foulke,  E.  and  Sticht,  T.G.  "A  Review  of  Research  on  the  Intelligibility  and 
Comprehension  of  Accelerated  Speech",  Psychological  Bulletin,   vol.  72, 
1969,  pp  50-62. 

Friedman,  H.L.  and  Johnson,  R.I.  Time-Compressed  Speech  as  an  Educational 
Medium: Studies of Stimulus Characteristics and Individual Differences
(Report  No.  R69-14)  American  Institutes  for  Research,  Silver  Spring, 
MD,  1969. 

Goldstein, H. Reading and Listening Comprehension at Various Controlled Rates,
Doctoral  Dissertation,  Teachers  College,  Columbia  University,  1940. 

Goodman, K.S. "The 13th Easy Way to Make Learning to Read Difficult: A
Reaction to Gleitman and Rozin", Reading Research Quarterly, vol. 8, 1973,
pp 484-493.


Gray,  W.S.  The  Teaching  of  Reading  and  Writing,   Scott,  Foresman,  &  Co., 
Chicago,  1956. 

Huey, E.B. The Psychology and Pedagogy of Reading, Macmillan, New York, 1908,
(Republished)  MIT  Press,  Cambridge,  MA,  1968. 

Jester,  R.E.  and  Travers,  R.M.W.  "Comprehension  of  Connected  Meaningful  Dis- 
course as  a  Function  of  Rate  and  Mode  of  Presentation",  The  Journal  of 
Educational  Research,   vol.  59,  1966,  pp  297-302. 

Lenneberg,  E.H.  Biological  Foundations  of  Language,   John  Wiley  &  Sons,  New 
York,  1967. 

Miller,  G.R.  and  Coleman,  E.B.  "A  Set  of  Thirty-Six  Passages  Calibrated  for 
Complexity",  Journal  of  Verbal  Learning  and  Verbal  Behavior,    vol.  6, 
1967,  pp  851-854. 

Miron,  M.S.  and  Brown,  E.  "The  Comprehension  of  Rate  Incremented  Aural  Coding", 
in E. Foulke (ed.) Proceedings of the Second Louisville Conference on Rate
and/or Frequency-Controlled Speech, University of Louisville, Louisville,
KY,  February  1971. 

Mowbray,  G.H.  "Simultaneous  Vision  and  Audition:  The  Comprehension  of  Prose 
Passages  With  Varying  Levels  of  Difficulty",  Journal  of  Experimental 
Psychology,   vol.  46,  1953,  pp  365-372. 

National  Assessment  of  Educational  Progress.  Reading  Rate  and  Comprehension, 
1970-71 Assessment. Report 02-R-09, Education Commission of the States, Denver,
CO,  December  1972. 

Shuy,  R.  "Some  Language  and  Cultural  Differences  in  a  Theory  of  Reading",  in 
K. Goodman and J. Fleming (eds.) Psycholinguistics and the Teaching of
Reading,   The  International  Reading  Association,  Newark,  DE,  1969. 

Smith, F. Understanding Reading, Holt, Rinehart, and Winston, New York, 1971.

Sokolov,  A.N.  Inner  Speech  and  Thought,   Plenum  Press,  New  York,  1972. 

Sticht, T.G. "Learning by Listening", in R.O. Freedle and J.B. Carroll (eds.)
Comprehension and the Acquisition of Knowledge, V.H. Winston and Sons,
Washington, DC, 1972.

Sticht,  T.G.  "Some  Relationships  of  Mental  Aptitude,  Reading  Ability,  and 
Listening  Ability  Using  Normal  and  Time-Compressed  Speech",  Journal  of 
Communication,   vol.  18,  1968,  pp  243-258. 

Sticht,  T.G.  The  Acquisition  of  Literacy  by  Children  and  Adults,    Paper  pre- 
sented at  the  Second  Delaware  Symposium  on  Curriculum,  Instruction,  and 
Learning;  University  of  Delaware,  June  1975. 


Sticht, T., Beck, L., Hauke, R., Kleiman, G., and James, J., Auding and
Reading:  A  Developmental  Model,   HumRRO  Press,  Alexandria,  VA,  1974. 

Taylor, S.E. "An Evaluation of Forty-One Trainees Who Have Recently
Completed the 'Reading Dynamics' Program", Problems, Programs, and Projects
in College Adult Reading, 11th Yearbook of the National Reading
Conference, 1962, pp 41-56.

Young,  R.Q.  "A  Comparison  of  Reading  and  Listening  Comprehension  With  Rate 
of  Presentation  Controlled",  AV  Communication  Review,   vol.  21,  1973, 
pp  327-336. 


Correlates of Successful Speech Compression Use by Blinded Veterans

by De l'Aune, W., Lewis, C., Needham, W., Nelson, J.


CORRELATES  OF  SUCCESSFUL  SPEECH  COMPRESSION  USE  BY  BLINDED  VETERANS 

W. De l'Aune, Ph.D., C. Lewis, B.S., W. Needham, Ph.D., and J. Nelson, M.A.
Eastern  Blind  Rehabilitation  Center  and  Psychology  Service 
Veterans  Administration  Hospital,  West  Haven,  CT.  06516 

Abstract: Blinded veterans were asked to listen to four sections of a
seventh grade level biographical sketch which had been recorded at progressively
faster rates (1.0, 1.5, 2.0, and 2.5 times the initial rate of 194 words per
minute) through use of a commercially available electronic discrete time compressed
speech device. After each section (approximately 847 words), multiple choice
questions were asked. The highest rate at which 60% were answered correctly was
taken as the maximum comprehensible compression rate for that Subject. It was
found that 86% of the veterans tested could meet this criterion level of
comprehension  at  the  2.5  compression  rate.   Variables  such  as  age,  use  of 
hearing  aid,  education  level,  Wechsler  Adult  Intelligence  Scale  Verbal  IQ,  and 
scale  scores  of  the  Minnesota  Multiphasic  Personality  Inventory  and  the  Cali- 
fornia Psychological  Inventory  were  analyzed  for  possible  relationship  with 
the  Subject's  maximum  comprehended  compression  rate.   The  results  indicate  that 
younger  veterans  whose  personality  tests  indicated  better  psychological 
adjustment  tended  to  be  more  successful  in  comprehending  compressed  speech. 
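
The criterion described in the abstract can be illustrated with a short sketch. The base rate, the compression factors, and the 60% criterion come from the abstract; the function names and the example scores are hypothetical, and we assume "60% answered correctly" means at least 60% and that the highest passing rate counts even if a slower rate was failed.

```python
BASE_WPM = 194                           # initial recording rate, from the abstract
COMPRESSION_FACTORS = (1.0, 1.5, 2.0, 2.5)
CRITERION = 0.60                         # proportion of questions answered correctly


def rate_in_wpm(factor: float, base_wpm: float = BASE_WPM) -> float:
    """Word rate that results from playing the recording at a given factor."""
    return base_wpm * factor


def max_comprehensible_factor(proportion_correct: dict, criterion: float = CRITERION):
    """Highest compression factor at which the criterion was met, or None.

    Assumption: meeting the criterion means scoring at or above it.
    """
    passing = [f for f in COMPRESSION_FACTORS
               if proportion_correct.get(f, 0.0) >= criterion]
    return max(passing) if passing else None


# Hypothetical scores for one subject (not data from the study).
scores = {1.0: 0.90, 1.5: 0.80, 2.0: 0.65, 2.5: 0.50}
best = max_comprehensible_factor(scores)

print(best)               # 2.0
print(rate_in_wpm(best))  # 388.0
```

At the fastest condition, a factor of 2.5, the same calculation gives 485 wpm.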


Introduction: 

The  American  Foundation  for  the  Blind  estimates  that  there  are  approximately 
550,000 people in the United States today whose vision is rated at less than 20/200
or who have a visual field of less than 20°. These people are considered legally
blind  and  their  use  of  written  material  is  either  absent  or  severely  restricted. 
To  compensate  for  this  factor,  several  alternative  ways  of  processing  this  type 
of  material  are  currently  available,  at  least  in  advanced  experimental  form. 
These include Braille, modality transformation systems (Optacon and Stereotoner),
speech  synthesis  systems,  and  human  readers,  either  live  or  on  tape.   Especially 
in  the  case  of  the  once  sighted  blind,  all  of  these  methods,  except  those 
utilizing  speech,  have  the  disadvantage  of  forcing  the  visually  impaired 
individual  to  process  the  information  at  rates  much  slower  than  his  sighted 
counterparts. Even the methods using speech produced by human readers proceed
at  normal  speech  rates  (reading  aloud)  averaging  176.5  words  per  minute  (Johnson, 
1961).   In  contrast,  the  average  sighted  reader  (reading  silently)  can  cover 
from  300  to  500  words  in  the  same  period  of  time. 

There  have  been  many  attempts  to  increase  speech  rate  by  various  means 
but  most  have  distinct  problems.   Simply  asking  the  reader  to  speak  faster 
creates  only  minimal  stable  increases  in  rate  but  unacceptable  changes  in 
inflection  (Harwood,  1955).   Changing  the  playing  speed  of  a  recording  also 
creates  problems  in  terms  of  a  frequency  shift  of  one  octave  each  time  the 
playing  rate  is  doubled.   This  change  in  pitch  causes  the  familiar  "chipmunk" 
effect  and  renders  the  recording  unintelligible  prior  to  most  viable  increases 
in  rate. 
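
The relation between playing speed and pitch noted above can be written out as a small calculation. This is only a sketch of the standard relation (every frequency is multiplied by the speed factor, so the shift in octaves is the base-2 logarithm of that factor); the 120 Hz voice fundamental is a hypothetical value chosen for illustration.

```python
import math


def shifted_frequency(frequency_hz: float, speed_factor: float) -> float:
    """Frequency heard when a recording is played at speed_factor times normal."""
    return frequency_hz * speed_factor


def pitch_shift_octaves(speed_factor: float) -> float:
    """Pitch shift in octaves caused by changing the playing speed."""
    return math.log2(speed_factor)


# Hypothetical voice fundamental of 120 Hz, played at the rates used in this study.
for factor in (1.0, 1.5, 2.0, 2.5):
    print(factor,
          shifted_frequency(120.0, factor),
          round(pitch_shift_octaves(factor), 2))
```

Doubling the speed gives exactly one octave, as stated in the text, and a factor of 2.5 gives about 1.32 octaves.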

A previous study by the authors (De l'Aune et al., 1975) demonstrated
the  viability  of  commercially  obtainable  speech  compression  systems  when 
used  by  a  random  sample  of  blinded  veterans  enrolled  in  the  adjustment  to 
blindness  training  program  at  the  West  Haven  Veterans  Administration 
Hospital.   Correlations  were  reported  between  performance  in  the  speech 
compression  task  and  various  scale  scores  from  the  California  Psychological 
Inventory  (CPI)  and  the  Minnesota  Multiphasic  Personality  Inventory  (MMPI) 
with  veterans  exhibiting  indications  of  better  "psychological  health"  doing 
better  on  the  task.   A  significant  relationship  between  age  and  performance, 
with  younger  veterans  understanding  the  material  more  adequately,  was  also 
reported.   No  differences  between  veterans  of  varying  intellectual  levels 
as  measured  by  the  Wechsler  Adult  Intelligence  Scale  Verbal  IQ  (WAIS  IQ)  or 
educational  levels  were  found. 

Although the proportion of veterans in the previous study able to
comprehend the material at a rate twice as fast as originally recorded was
substantial (77.8%), it was felt that performance would be even higher if
testing were restricted to those veterans expressing an active interest in
speech compression devices because of vocational or avocational reading
needs. Further investigation of the personality, educational, and
intellectual variables was also desired for this group.


Method: 

To investigate blinded veterans' comprehension of discrete electronic
time-compressed speech and its relationship to the aforementioned variables,
it was first necessary to develop a test to quantitatively assess performance.
Dr. Emerson Foulke at the Center for Rate Controlled Recordings at the
University of Louisville supplied the experimenters with a professionally
recorded biographical sketch of Mary Bethune and one of Katherine Dunham.
The transcript of each test was divided, to the nearest paragraph, into four
approximately equal sections. The tapes were then divided in accordance
with the transcripts and rerecorded from a Varispeech unit, with each section
played at a progressively faster rate. The consecutive sections were recorded
at rates of 1.0 (about 190 words per minute (wpm)), 1.5 (about 285 wpm),
2.0 (about 380 wpm), and 2.5 (about 475 wpm). Each of the four sections of
each test was assigned five multiple choice questions excerpted from that
particular section of the test. All questions were pretested at normal
rates prior to the actual experiment. The two tests were randomly alternated
throughout the testing, with no significant differences in performance.
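
The word rates quoted for each section follow directly from the compression ratio and the base rate of about 190 wpm. A minimal sketch of that arithmetic is given below; the helper names are hypothetical.

```python
BASE_WPM = 190  # approximate rate of the uncompressed recording, per the text
COMPRESSION_RATIOS = (1.0, 1.5, 2.0, 2.5)


def presented_wpm(ratio, base_wpm=BASE_WPM):
    """Words per minute heard when the recording is compressed by `ratio`."""
    return base_wpm * ratio


def listening_minutes(word_count, ratio, base_wpm=BASE_WPM):
    """Minutes needed to hear `word_count` words at a given compression ratio."""
    return word_count / presented_wpm(ratio, base_wpm)


for ratio in COMPRESSION_RATIOS:
    print(ratio, presented_wpm(ratio))
```

At the 2.5 ratio a passage takes two-fifths of its original playing time.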

The subjects were members of the patient population of the Veterans
Administration's Eastern Blind Rehabilitation Center in West Haven, Connecticut,
who had expressed an interest in speech compression systems for either
vocational or avocational reading purposes. All subjects were legally blind
and varied in employment experience, educational level, and age.

To alleviate any tension which could accompany performance, the subjects
were told that the test they were about to take was not an indication of
their ability but was simply a measure of the technique of speech
compression. However, to impress them with the seriousness of the
task, they were also told to do their best since their performance could
influence the Veterans Administration's future decisions to issue such devices
to blinded veterans. Subjects were then instructed to listen carefully to
each of the four sections (in ascending compression rate) and told they
would be expected to answer five multiple choice questions immediately
after each section. Each question was repeated twice.

Comprehension rate was determined by scoring the number of correct
answers per section. Any subject having at least three answers out of five
correct was deemed to have satisfactorily comprehended that section. The
highest speech rate comprehended according to this criterion was then recorded
as the subject's Maximum Comprehended Compression Rate (MCCR) score. In addition
to this score, information about the veteran's age, educational level, WAIS IQ,
MMPI, and CPI scores was obtained, if possible.

The data were then subjected to statistical analysis utilizing descriptive
statistics, independent t-tests, and linear regressions whenever appropriate.

Results:

It was found that 86% of the subjects were able to meet criterion levels
of comprehension at the 2.5 compression rate (n=99). If the subjects from
the previous study are included, the mean percent correct scores and standard
deviations for each presentation rate are those shown in Table 1.

TABLE 1.

Compression Rate         1.0      1.5      2.0      2.5
Mean Percent Correct     89.61    87.91    81.86    67.60
Standard Deviation       18.22    19.59    24.30    27.97
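
The MCCR scoring rule given in the Method section can be sketched as follows. The paper does not say how a subject was scored who failed a slower section but passed a faster one; this sketch assumes the highest passed rate is credited regardless, and it returns None when no section is passed. Both choices, and the function name, are assumptions.

```python
from typing import Mapping, Optional

CRITERION = 3  # correct answers (out of five) needed to "comprehend" a section


def mccr(correct_by_rate: Mapping[float, int],
         criterion: int = CRITERION) -> Optional[float]:
    """Return the highest compression rate whose section met the criterion.

    `correct_by_rate` maps each compression rate (1.0, 1.5, 2.0, 2.5)
    to the number of questions answered correctly for that section.
    """
    passed = [rate for rate, n_correct in correct_by_rate.items()
              if n_correct >= criterion]
    return max(passed) if passed else None


# Hypothetical subject: meets criterion through 2.0 but not at 2.5.
print(mccr({1.0: 5, 1.5: 4, 2.0: 3, 2.5: 2}))
```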


Linear regressions of MCCR scores and the available (n=68) veterans' scores
from the CPI revealed statistically significant (p ≤ .05) positive correlations
between MCCR and Dominance (Do), Tolerance (To), Psychological mindedness (Py),
and Flexibility (Fx). It should be noted that all of the 18 CPI scales were
positively correlated with performance (see Table 2).


TABLE 2.

Linear Regressions between CPI
Scales and MCCR scores.

Scale                          r

Do (Dominance)                 0.295*
Cs (Capacity for Status)       0.131
Sy (Sociability)               0.159
Sp (Social Presence)           0.192
Sa (Self-acceptance)           0.166
Wb (Well-being)                0.066
Re (Responsibility)            0.194
So (Socialization)             0.084
Sc (Self-control)              0.070
To (Tolerance)                 0.250*
Gi (Good impression)           0.164
Cm (Communality)               0.140
Ac (Achiev. Conform.)          0.130
Ai (Achiev. Indep.)            0.202
Ie (Intell. Efficiency)        0.193
Py (Psychol. mindedness)       0.356**
Fx (Flexibility)               0.280*
Fe (Femininity)                0.021

[Figure 1. CPI profiles (standard scores) of the two criterion groups.]
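
The significance markers in Table 2 can be checked against the usual t statistic for a correlation coefficient, t = r * sqrt(n - 2) / sqrt(1 - r^2). The sketch below assumes this is the test the authors applied, which the paper does not state. With 66 degrees of freedom, the two-tailed critical value at the .05 level is roughly 2.0.

```python
import math


def t_from_r(r: float, n: int) -> float:
    """t statistic for testing whether a Pearson correlation `r`
    computed from `n` paired observations differs from zero."""
    if n <= 2:
        raise ValueError("need more than two observations")
    if abs(r) >= 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Values taken from Table 2 (n = 68).
for scale, r in [("To", 0.250), ("Ai", 0.202), ("Py", 0.356)]:
    print(scale, round(t_from_r(r, 68), 2))
```

Under that assumption, Tolerance (r = 0.250) falls just above the critical value and Achievement via Independence (r = 0.202) just below it, which agrees with the asterisks in the table.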


The sample was then divided into those subjects not able to meet criterion at
the 2.5 rate (n=22) and those subjects successfully meeting criterion at this
level (n=46), and t-tests were used to determine whether the CPI scale scores
differed between the two groups. The scales of Dominance, Capacity for Status,
Sociability, Social Presence, Responsibility, Tolerance, Communality,
Achievement via Independence, Intellectual efficiency, and Psychological
mindedness had significantly different means for the two groups. This
relationship is shown in Figure 1.


Linear  regression  of  MCCR  and  the  available  (n=63)  veterans'  scores  from 
the  MMPI  revealed  a  significant  positive  relationship  with  the  K  scale  and  a 
significant  negative  relationship  with  the  Depression  (D)  scale.   Eight  of  the 
ten  clinical  scales  of  the  MMPI  were  negatively  correlated  with  performance 
(see  Table  3). 

TABLE 3.

Linear Regressions between
MMPI Scales and MCCR scores.

Scale                           r

L  (Lie)                        0.017
F  (Frequency)                 -0.123
K  (Correction)                 0.289*
Hs (Hypochondriasis)           -0.164
D  (Depression)                -0.371**
Hy (Hysteria)                  -0.060
Pd (Psychopathic deviate)      -0.091
Mf (Masculinity/femininity)    -0.024
Pa (Paranoia)                  -0.201
Pt (Psychasthenia)             -0.011
Sc (Schizophrenia)              0.040
Ma (Hypomania)                  0.097
Si (Social introversion)       -0.250


When this group was divided by the performance criterion level (high, n=43;
low, n=20), it was found that the K scale, Depression, and Social introversion
were significantly different. Figure 2 shows this relationship.


A  significant  negative  relationship  was  found  to  exist  between  age  and 
MCCR  (r=-0.280*,  n=84).   No  significant  relationships  were  found  between  MCCR 
and  WAIS  verbal  IQ  (n=65)  or  educational  level  (n=79). 

Discussion: 

On the basis of the data analyzed in the preceding section of this paper,
several sets of conclusions can be drawn, both in regard to the overall level
of performance and certain factors influencing the performance of blinded veterans
in the comprehension of discrete time-compressed speech.

The first of these concerns the percentage of blinded veterans capable
of comprehending recorded materials at rates faster than presently available
with standard sound reproduction equipment. The finding that 86% of the
veterans expressing vocational or avocational needs for speech compression
were able to comprehend material presented at rates approaching 475 words
per minute indicates that these devices are of very broad applicability and
may serve a viable role in providing veterans with a comprehensible,
time-saving means of aural reproduction of verbal materials.

The second set of conclusions concerns the possible indicators of performance
in the speech compression task. Subjects who were younger tended to do better in
this experiment. WAIS IQ and educational level were not linked in any fashion
to performance in this task. These results also provide hope for wide
applicability of the devices.

[Figure 2. Minnesota Multiphasic Personality Inventory Profiles of Blinded
Veterans Who Could and Could Not Understand Speech Compressed at the 2.5 Rate.
Scales plotted: ?, L, F, K, Hs+.5K, D, Hy, Pd+.4K, Mf, Pa, Pt+1K, Sc+1K,
Ma+.2K, Si. Legend: MCCR ≥ 2.5 (n = 43); MCCR < 2.5 (n = 20);
*p ≤ 0.05, **p ≤ 0.01.]


The scale scores of the two personality tests were much better predictors
of performance on the speech compression task. The CPI personality scales
which were significantly related to successful comprehension of compressed
speech include those that reflect positive socialization, maturity and
responsibility, achievement potential, a sensitivity to psychological
variables, and cognitive-behavioral adaptability. The MMPI K scale is a
correction factor and, although it is used primarily as a validity scale, high
scores on it are at times associated with emotional health. Depression,
like most of the other MMPI clinical scales, indicates emotional pathology
with high scores. Traits associated with successful emotional and behavioral
adjustment in general thus seem to covary with successful speech comprehension.
This conclusion is also supported by the fact that all of the CPI scales,
which measure favorable characteristics in general, are positively related
(although only some of them were significantly so on an individual basis) to
successful speech comprehension.

The finding that presence or absence of "psychological health" is
associated with positive sensory and intellectual characteristics in the
blind (despite a lack of relationship with WAIS IQ) is consistent with
other studies. Psychological state has been found to be related to ability
to detect open and closed spaces by ambient sound (De l'Aune et al., 1974),
mobility variables such as velocity and veer (De l'Aune et al., 1975a; Needham
et al., 1975), and acquisition of Optacon skill (Gadbaw and De l'Aune, 1974).
Accordingly, in the blind, or at least blinded veterans, there appears to
be a global trait which could be termed "functional" or "dysfunctional
overflow," whereby psychological adjustment relates to a wide variety of
abilities and skills.


Bibliography 

De l'Aune, W., Needham, W., Lewis, C., and Nelson, J. Speech compression
and blinded veterans. Proceedings of Devices and Systems for the Disabled,
Krusen Center for Research and Engineering at Moss Rehabilitation Hospital,
Philadelphia, 1975, 69-75.

De l'Aune, W., Needham, W., and Kevorkian, G. Relationships between indices of
mobility and personality factors in the blind: I. Minnesota Multiphasic
Personality Inventory. Journal of the International Research Communications
Service, 3, 80, 1975.

De l'Aune, W., Scheel, P., Needham, W., and Kevorkian, G. Evaluation of a
methodology for training indoor acoustic environmental analysis in
blinded veterans. Proceedings of the 1974 Conference on Engineering
Devices in Rehabilitation, Tufts-New England Med School, Boston, 1974,
26-31.

Gadbaw, P. and De l'Aune, W. Correlates of successful Optacon learning.
Presented at Experienced Optacon Teacher's Seminar, Palo Alto, California,
November, 1974.

Harwood, K. Listenability and rate of presentation. Speech Monographs,
22, 1955, 57-59.

Johnson, W. Measurement of oral reading and speaking and disfluency of
adult male and female stutterers and non stutterers. Journal of Speech
and Hearing Disorders Monograph Supplement, No. 7, 1961.

Needham, W., De l'Aune, W., and Kevorkian, G. Relationships between indices of
mobility and personality factors in the blind: II. California Personality
Inventory. Journal of the International Research Communications Service,
3, 81, 1975.


A  Study  of  the  Instructional  Potential  of  Compressed  Speech  for 
Postsecondary  Technical  Students 
by    Wood,    D.L. 


Dr. Wood, Special Teacher of Reading (K-12), is an Assistant
Professor in the Associate of Applied Science-Related Studies
Department of the State Technical Institute at Memphis.


ABSTRACT 

A  STUDY  OF  THE  INSTRUCTIONAL  POTENTIAL  OF  COMPRESSED  SPEECH 
FOR  POSTSECONDARY  TECHNICAL  STUDENTS 

The purpose of this study was to examine the relationship between
rate controlled speech and reading comprehension skills of postsecondary
technical students. The particular comprehension skills of interest
were (1) the ability to detect main ideas, (2) the ability to determine
supporting details, and (3) the ability to draw conclusions. These
skills were assessed for both general and study listening materials
presented at varying accelerated speech rates. Specifically, the
investigator hypothesized that compression rates ranging from 205 to 310
words per minute, in steps of 15, would lead to no difference in the
comprehension of main ideas, details and conclusions for any of three
classes of readers.

Two hundred and eight male technical students from a Mid-South
technical school were subjects for the study. On the basis of their
performance on the Diagnostic Reading Test, Survey Section, Upper Level,
Form B, the students were classified into one of three reading ability
groups: below average, average and above average readers. The subjects
were randomly assigned to listen to the comprehension selections of the
Diagnostic Reading Test, Survey Section, Upper Level, Form F, which had
been recorded by a male professional reader from the Perceptual
Alternatives Laboratory in Louisville, Kentucky. The specific rates of the
recorded selections were 205, 220, 235, 250, 265, 280, 295 and 310 words
per minute.


A  3  x  8  treatments  by  levels  analysis  of  variance  was  used  to 
test  for  significant  differences  among  the  means. 
The  major  findings  of  this  study  were: 

1.  There  was  a  significant  difference  in  the  listening  ability  of  the 
above  average,  average  and  below  average  readers  to  comprehend  main 
ideas,  details  and  conclusions  in  general  and  study  materials. 

2.  There was a significant difference in the listening ability of the
three types of readers to comprehend details in general and study
materials at the accelerated speech rates of 205 to 310 words per
minute (in steps of 15).


A  STUDY  OF  THE  INSTRUCTIONAL  POTENTIAL  OF  COMPRESSED  SPEECH 
FOR  POSTSECONDARY  TECHNICAL  STUDENTS 
By  Donna  Wood 

Reading  and  listening  are  basic  tools  of  communication  and 
learning.   The  importance  of  reading  has  never  been  questioned  while  the 
significance  of  listening  is  receiving  increased  attention  (Duker,  1971). 
Listening  is  acknowledged  as  an  important  skill  at  every  developmental 
level  from  nursery  school  through  college  and  is  accepted  as  a  vital 
component  of  the  learning  process. 

The  relationship  between  learning  by  reading  and  learning  by 
listening  was  recognized  in  the  first  professional  book  written  on  the 
teaching of reading (Huey, 1908). Since 1908, a number of research
studies have revealed evidence of a correlation between reading and
listening. A review of research studies comparing the efficiency of
learning by reading and learning by listening supports the concept that
reading  and  listening  have  been  equally  efficient  in  the  acquisition  of 
knowledge.   Approximately  one-third  of  the  studies  reported  a  slight 
advantage  to  reading,  one-third  a  slight  advantage  to  listening,  and 
one-third  found  differences  to  be  non-significant  (Nichols,  1951). 

Orr (1964) has postulated that the speed of processing
incoming information may be directly correlated to the reading and listening
rates acquired by an individual. Because most individuals seldom
encounter connected discourse at speeds greater than normal speaking
rates or have occasion to process information more rapidly than required
by their normal reading speeds, the logical consequence is conditioning
to a relatively slow rate of thought. Given the opportunity to listen to
speeded discourse, what variations in comprehension would occur for
individuals with differing reading abilities? This question forms
the crux of the present study.

The  purpose  of  this  experimental  study  was  to  determine  the 
applicability  and  utility  of  rate  controlled  speech  and  its  relationship 
to  reading  comprehension  skills  in  the  technical  school  setting.  A 
review  of  related  literature  revealed  a  dearth  of  studies  regarding  the 
use  of  compressed  speech  with  technical  students. 

Three specific comprehension skills were of interest and were
made the focus of the data analysis. These skills, determined by a
content analysis of the test items, were: the ability to detect main ideas,
the ability to note supporting details and the ability to draw
appropriate conclusions. Responses for these comprehension skills were
obtained for general as well as study materials. General material was
defined as story-type passages with a relatively easy vocabulary while
study material was defined as material associated with social studies
and science containing a somewhat specialized vocabulary and specific
factual concepts.

Two principal hypotheses were tested: (1) that increased
compressed speech rates would have no effect on listening
comprehension of main ideas, details or conclusions; and (2) that
listening comprehension of main ideas, details and
conclusions would not differ according to the reading ability of the
subjects.

The subjects for the study were 208 male technical students from
the State Technical Institute at Memphis. This Institute offers the
Associate Degree in 24 technological areas; among these are 11
engineering technologies, 8 business and science technologies and 5 applied
science areas which are application-oriented programs. A majority of
the subjects participating in the study were from the Reading and Study
Improvement classes. The subjects ranged in age from 17 to 62 years and
were by race 71 percent Caucasian, 28 percent Negro and 1 percent Oriental.
Seventy-five percent of the subjects were enrolled in accounting,
mid-management, engineering or data processing technologies. Ninety percent
of the subjects had no prior knowledge of, or encounter with, time
compressed speech. Students with previous exposure to time compressed
speech reported having encountered it in military service.

The Diagnostic Reading Test, Survey Section, Upper Level, Form B
(Triggs, Mountain Home, North Carolina), was administered to each subject.
On the basis of his performance, the subject was placed in one of three
reader categories: below average reader, average reader or above
average reader. The below average readers scored in the 1-33 percentile
range, the average readers scored in the 34-66 percentile range and the
above average readers scored in the 67-99 percentile range. There were
80 below average readers, 80 average readers and 48 above average
readers.
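
The percentile cut-offs above amount to a simple three-way classification, sketched below. The function name and the integer-percentile input are assumptions made for illustration.

```python
def reader_group(percentile: int) -> str:
    """Classify a subject by Diagnostic Reading Test percentile rank,
    using the 1-33, 34-66 and 67-99 bands described in the text."""
    if not 1 <= percentile <= 99:
        raise ValueError("percentile must be between 1 and 99")
    if percentile <= 33:
        return "below average"
    if percentile <= 66:
        return "average"
    return "above average"


# Group sizes reported in the study; together they account for all 208 subjects.
GROUP_SIZES = {"below average": 80, "average": 80, "above average": 48}
print(sum(GROUP_SIZES.values()))
```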

The rate controlled speech recordings were prepared at the
Perceptual Alternatives Laboratory at Louisville, Kentucky. A male
professional reader was used to record the comprehension selections from
Form F of the Diagnostic Reading Test. Form F was chosen because the
principal selections contained information on the praying mantis, a
topic about which few subjects would have factual information. The
specific compression rates selected were chosen to provide increments large
enough to be readily perceived as increases in rate yet small enough to
detect a rate at which changes in comprehension occurred. The rates
chosen were 205, 220, 235, 250, 265, 280, 295 and 310 words per minute.
Since 90 percent of the subjects had no prior experience with time
compressed speech, a 10 minute practice tape was used. The practice tape
consisted of a series of recorded segments, increasing in speed with
increments of 35 words per minute and ranging from 210 to 385 words per
minute. Each subject was then randomly assigned to listen to a compressed
speech tape at one of the eight rates selected for the study.
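
The two rate schedules and the random assignment described above can be sketched as follows. The paper does not say whether assignment was balanced across rates; this sketch assumes a simple independent random draw per subject, and the function name and seed parameter are hypothetical.

```python
import random

# Eight study rates: 205 to 310 wpm in steps of 15.
STUDY_RATES = list(range(205, 311, 15))
# Practice-tape segments: 210 to 385 wpm in steps of 35.
PRACTICE_RATES = list(range(210, 386, 35))


def assign_rates(subject_ids, rates=STUDY_RATES, seed=None):
    """Randomly assign each subject to one listening rate.

    Assumption: each subject is drawn independently, so the number of
    subjects per rate is not forced to be equal.
    """
    rng = random.Random(seed)
    return {sid: rng.choice(rates) for sid in subject_ids}


assignment = assign_rates(range(208), seed=0)
print(STUDY_RATES)
print(PRACTICE_RATES)
```

Crossing the eight rates with the three reader groups gives the 24 cells of the 3 x 8 design.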

A 3 x 8 treatments by levels analysis of variance was performed
to analyze the data. The hypotheses related to main ideas, details and
conclusions in general and study materials were examined separately. The
level of significance used was .05; however, the .01 level was reported
where it existed. The Scheffe post hoc procedure was used to test for
significance among the means.

The obtained mean square ratios for Main Effect A (Words Per
Minute) in the ANOVA were not significant for main ideas and conclusions
in general or study materials; however, the mean square ratios were
significant for details at the .01 level for both general and study
material. The Scheffe analysis revealed no significant differences
among the means. This seemingly paradoxical phenomenon was also encountered
by Myers (1973), who stated that there is no guarantee that
such differences will be significant when the over-all F test is significant.
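The apparent paradox follows from how conservative the Scheffe criterion is. In the usual textbook formulation (an assumption here, since the paper does not give its formulas), a contrast among k means is significant only if its F ratio exceeds (k - 1) times the critical F used for the overall test. With eight word rates the multiplier is 7. The sketch below illustrates this; every number in it is hypothetical and none comes from the study.

```python
def pairwise_contrast_f(mean_a, mean_b, n_a, n_b, ms_error):
    """F ratio for the pairwise contrast mean_a - mean_b."""
    return (mean_a - mean_b) ** 2 / (ms_error * (1.0 / n_a + 1.0 / n_b))


def scheffe_criterion(k, f_crit):
    """Value a contrast F must exceed under the Scheffe procedure.

    f_crit is the tabled critical F for the overall test with
    (k - 1, error df) degrees of freedom, supplied by the caller.
    """
    return (k - 1) * f_crit


K_RATES = 8     # eight word rates, as in the study
F_CRIT = 2.1    # hypothetical placeholder, not a value from the paper

# Hypothetical means, group sizes and error mean square.
f_pair = pairwise_contrast_f(14.0, 12.0, 10, 10, 4.0)
criterion = scheffe_criterion(K_RATES, F_CRIT)

print("contrast F:", f_pair)
print("exceeds omnibus critical F:", f_pair > F_CRIT)
print("exceeds Scheffe criterion:", f_pair > criterion)
```

A contrast can therefore clear the ordinary critical value comfortably and still fall short of the Scheffe criterion, which is the situation Myers describes.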

Main  Effect  B  (Reader  Type)  in  the  ANOVA  was  significant  at  the 
.01 level for listening comprehension of main ideas, details and
conclusions. The Scheffe analysis of the means revealed that there was a
significant  difference  between  the  scores  of  the  above  average  and 
below  average  readers  but  no  significant  difference  existed  between 
the  above  average  and  average  readers.   This  held  true  in  all  cases 
except  for  details  in  study  material  and  conclusions  in  general  material 
where  all  groups  showed  significant  differences. 

In  a  comprehensive  sense,  there  were  no  significant  interaction 
effects between words per minute and type of reader for any of the
comprehension skills. However, a careful look at the matrix of cell means
suggested the presence of "local" interaction effects in these data.
Lindquist (1953) explained similar situations by stating that
the  F-test  of  interaction  for  the  ANOVA  table  as  a  whole  is  not  very 
sensitive  to  an  interaction  affecting  only  a  small  part  of  the  entire 
table. 

Comprehension performance fluctuated across the accelerated
word rates for all types of readers. The level of comprehension of
main ideas in general material for below average and average readers
tended to decrease at 295 words per minute, while the level of
comprehension of main ideas in general material for the above average
readers increased through 310 words per minute. In comprehending main
ideas in study material, the below average readers' and the above average
readers' maximum comprehension performances occurred at 235 words per
minute. In comparison, the best performance for comprehension of main
ideas in study material by average readers occurred at 280 words per
minute.

In  comprehending  details  in  general  material,  the  below  average 
and average readers demonstrated a greater fluctuation in their
performance than did the above average readers. The below average readers'
highest  comprehension  performance  was  at  220  and  265  words  per  minute. 
These  readers  showed  progressively  diminishing  scores  at  280,  295  and 
310  words  per  minute.   The  average  readers  performed  best  at  the  lower 
rates  of  205,  220  and  235  words  per  minute.  The  above  average  readers 
demonstrated  a  peak  performance  at  235  words  per  minute  and  with  each 
subsequent  increase  in  word  rate,  their  comprehension  scores  for  details 
in  general  material  decreased. 

The  comprehension  scores  for  noting  details  in  study  material 
varied  appreciably  across  the  accelerated  rates  for  all  types  of  readers. 
The  below  average  readers'  performance  fluctuated  until  reaching  a  peak 
at  280  words  per  minute.  The  average  readers'  best  performance  was  at 
205  words  per  minute  and  varied  somewhat  until  reaching  a  peak  at  265 
words  per  minute.  Beyond  that  point,  there  was  a  consistent  lowering 
of scores at each higher word rate. The above average readers
demonstrated more consistency in their performance with their top scores
occurring at 220 and 235 words per minute. A slight fluctuation followed
with  the  remaining  rates  until  a  plateau  was  reached  at  265  and  280  words 
per  minute. 

The  mean  comprehension  scores  for  conclusions  in  general  material 
of  the  above  average  readers  showed  less  variability  than  did  the  scores 
for the below average and average readers. The below average readers'
best  performance  occurred  at  280  words  per  minute  followed  by  progressive 
decreases  at  295  and  310  words  per  minute.   The  average  readers'  mean 
scores  began  a  progressive  decline  at  265  words  per  minute.   The  above 
average  readers  performed  better  at  220  and  235  words  per  minute  and 
showed  a  steady,  consistent  performance  throughout  the  range  of  increasing 
rates. 

Surprisingly,  for  comprehension  of  conclusions  in  study  material, 
the  below  average  readers  tended  to  perform  better  as  the  word  rate 
increased.   The  average  readers'  highest  performances  were  recorded  at  205 
and  280  words  per  minute.  The  above  average  readers'  peak  score  occurred 
at  250  words  per  minute.   For  average  and  above  average  readers,  decreases 
in  comprehension  scores  occurred  at  the  highest  rates,  295  and  310  words 
per  minute. 

Results  of  the  questionnaire  relating  to  personal  information  and 
feelings  toward  rate  controlled  recordings  indicated  that  58  percent  of  the 
subjects  in  this  study  would  use  rate  controlled  recordings  if  they  were 
available.   Twenty-seven  percent  of  the  subjects  responded  negatively  to  this 
option  and  15  percent  were  undecided.   The  most  preferred  rates  were  from 
200-250 words per minute while the least desirable rates were those
in the range approaching 300 words per minute. Review material, rather
than new, unfamiliar and highly technical material, was designated as
being most suitable for compression.

In  general,  the  over-all  trend  of  comprehension  scores  for  the 
subjects  in  this  study  supports  the  theory  in  the  existing  literature 
that comprehension of connected discourse tends to diminish at
accelerated rates between 275 and 300 words per minute. Moreover, this study
supports  the  contention  that  students  with  relatively  high  intelligence, 
educational  attainment  or  general  cognitive  ability  are  more  likely  to 
perform  well  when  subjected  to  compressed  messages. 

As  a  result  of  this  study,  and  so  far  as  the  practicality  and 
utility  of  time  compressed  speech  with  technical  students  are  concerned, 
the  author  is  convinced  that  the  use  of  rate  controlled  recordings  should 
be  offered  as  an  alternative  or  an  option  for  those  students  who  prefer 
to  learn  through  the  auditory  mode  and  for  those  who  demonstrate  average 
or above average reading skills. The wide range of rate choices suggests
the need for separate units with the capacity for individual rate choice.
Furthermore, the type of material used at accelerated rates should be
carefully chosen, keeping in mind that main ideas and
conclusions of recorded material are more effectively conveyed at
accelerated rates than are details. For technical school students, materials
of  a  review  nature  seem  more  suitable  for  compression  than  unfamiliar  or 
highly  technical  materials. 


BIBLIOGRAPHY 

Sam Duker, "Listening and Reading," Listening: Readings,
compiled by Sam Duker (New York: The Scarecrow Press, Inc., 1971),
p. 68.

Edmund Burke Huey, The Psychology and Pedagogy of Reading
(New York: Macmillan, 1908), p. 123.

Ralph G. Nichols, "Needed Research in Listening Communication,"
Journal of Communication I (May, 1951), p. 48.

David B. Orr, "Note on Thought Rate as a Function of Reading
and Listening Rates," Perceptual and Motor Skills XIX (December, 1964),
p. 874.

The Diagnostic Reading Tests (Mountain Home, North Carolina:
The Committee on Diagnostic Reading Tests, Inc.).

Jerome Myers, Fundamentals of Experimental Design, Second
Edition (Boston: Allyn and Bacon, Inc., 1973), p. 364.

E. F. Lindquist, Design and Analysis of Experiments in Psychology
and Education (Boston: Houghton Mifflin Company, 1953), p. 140.


Recent  Army  Research  in  Compressed  Speech 
by  Shields,    J.  L. 


ABSTRACT 
RECENT  ARMY  RESEARCH  IN  COMPRESSED  SPEECH 

Joyce  L.   Shields 

Recent technological advances have made available relatively low cost
speech compression devices with great potential utility. Methods to evaluate
devices and procedures to train people to "speed listen" are needed to
employ this technology. To meet this need and to aid Army communications
processors in decreasing backlogs, two experiments were conducted by
researchers at the Army Research Institute.

Experiment 1. A technique for measuring the intelligibility of
connected speech was developed which can be used to evaluate speech produced
by speech compression devices. An automated tracking method was used to
measure the maximum rate of connected speech listeners judged they could
understand. Higher maximum rates were obtained for compressed speech
(with pitch control) than for speeded speech (without pitch control). The
relationship of this threshold measurement to comprehension test
measurements will be discussed.

Experiment 2. Five methods for training effective listening of
time-compressed speech were studied. Army subjects trained by the two methods
employing incentives comprehended speech played at rates 2.2 times faster
than normal without degradation of performance. Performance of subjects
trained without incentives was significantly degraded when compressed
speech was presented at rates in excess of 1.85 times the normal rate.


Recent  technological  advances  have  made  available  relatively  low 
cost  speech  compression  devices  that  have  great  potential  utility. 
In order to employ this technology successfully, methods are needed
to evaluate the devices, and procedures are needed to train people
to "speed listen." The present paper reports the results of two
experiments  conducted  as  part  of  a  research  program  at  the  Army  Research 
Institute  for  the  Behavioral  and  Social  Sciences.   This  program  was 
designed  to  develop  evaluation  and  training  methods  to  assist  Army 
communications  processors  who  are  frequently  confronted  with  large 
amounts  of  taped  voice  material.   Army  support  for  the  present  research 
was stimulated by an earlier study by Dr. Shields in which the ability of
communications  processors  to  classify  language  communications  was  explored. 
Her  results  indicated  that  untrained  listeners  could  accurately  classify 
messages  by  subject  matter  at  compressed  rates  of  1.5  times  the  normal 
rate.   The  results  of  this  experiment  generated  further  interest  in 
investigating rate controlled speech within the Army.

EXPERIMENT I.

The  first  experiment  was  designed  to  investigate  the  utility  of 
several  procedures  for  training  Army  personnel  to  listen  to  and  comprehend 
compressed  speech  (Lambert,  Shields,  Gade,  and  Dressel,  1976). 

The  results  of  past  experimental  attempts  to  train  comprehension  of 
compressed  speech  have  thus  far  been  equivocal.  For  example,  Foulke  (1964) 
and  Barnard  (1970)  were  not  able  to  demonstrate  gains  in  comprehension  after 
listening  experience  with  compressed  speech.   On  the  other  hand,  a  number  of 
investigators (Grumpelt and Rubin, 1968; Harley, 1966; Klineman, 1963;
Orr, Friedman and Williams, 1965; and Voor, 1962) have reported some
success in training people to comprehend time-compressed speech.
However, only one of these studies (Grumpelt and Rubin, 1968) employed
a control group which was given the same training with normal speech
as the experimental group prior to being tested on comprehension of
compressed speech. Although they report that the comprehension of
the experimental group was significantly better than that of the control
group, the fact that neither group showed significant gains between the pretest
and the final test suggests that the two groups may have differed in
comprehension ability prior to the administration of the training
procedures. Furthermore, none of the studies reported to date has
systematically  manipulated  motivational  variables  in  the  course  of  their 
attempts  at  training.   The  present  experiment  was  designed  by  Dr.  Joseph 
Lambert,  with  the  support  of  Dr.  Shields,  Dr.  Gade  and  Mr.  Dressel,  to 
explore  the  effects  of  introducing  incentives  into  the  training  procedure 
under  appropriate  control  conditions. 

Five  experimental  methods  employing  different  combinations  of 
intrinsic  and  extrinsic  incentives  were  studied.   Forty-seven  Army 
enlisted  personnel  with  AGCT  scores  of  110  and  above  were  assigned  to  one 
of  five  experimental  groups.  All  groups  listened  to  the  same  selected 
passages  of  the  talking  book,  The  Proud  Tower  by  Barbara  Tuchman,  during 
five  one-hour  daily  sessions.   Each  session  was  divided  into  three  20-minute 
segments.  After  each  segment,  subjects  answered  10  multiple  choice 
questions  on  the  content.   At  the  end  of  the  week,  all  subjects  were  given 
a standardized comprehension test. This test was divided into five 10-minute
sub-tests (when played at normal speed). The sub-tests were presented
at one of five speeds: 1X, 1.5X, 1.85X, 2.2X and 2.55X. Recordings were
made using an AmBiChron Speech Rate Changer and a Crown Recorder/Reproducer.
Briefly, the five experimental groups differed in the following ways.

Group A. Point Acquisition and Leave (PAL). During the daily
training  sessions,  subjects  were  given  points  that  counted  toward  leave  time 
based  on  (1)  whether  they  elected  to  listen  to  faster  than  normal  speech  as 
opposed  to  slower  than  normal  speech  during  each  speech  segment  and  (2)  the 
number  of  multiple  choice  questions  answered  correctly  at  the  end  of  each 
speech  segment.   If  60%  of  the  questions  were  answered  correctly,  the  rate 
of  fast  speech  for  the  next  segment  was  increased  by  13  words  per  minute. 
Subjects  were  given  immediate  feedback  regarding  their  accuracy. 

Group B. Point Loss Avoidance and Leave (PLAL). Subjects were
given the maximum number of points counting toward leave time at the
start of the first speech segment. Subjects in this group lost points
if they (1) failed to elect to listen to compressed speech or (2) failed
to  answer  questions  correctly  on  the  multiple  choice  test  at  the  end  of 
each  speech  segment.   If  60%  of  the  questions  were  answered  correctly,  the 
rate  of  fast  speech  for  the  next  segment  was  increased  13  words  per  minute. 
Subjects  were  given  immediate  feedback  regarding  their  accuracy. 

Group C. Point Acquisition and No Leave (PANL). This group was treated
in  the  same  manner  as  was  Group  A,  except  that  no  leave  time  was  earned  by 
accumulating  points. 

Group D. Fast Control (FC). Subjects in this group listened to the
same passages as the other groups over the same time period. All speech
passages were presented at 2.2 times the normal rate. No points were given
and no leave was earned.

Group E. Normal Control (NC). Subjects in this group listened to the
same passages as all other groups over the same time period. All speech
passages for this group were presented at the normal rate. Again, no points
or leave time were given.
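The rate adjustment rule shared by Groups A, B and C can be sketched as follows. This is an illustration under stated assumptions, not the experimenters' apparatus: the text gives the 60% criterion, the 10 questions per segment and the 13 wpm increment, but "60%" is read here as "at least 60%", the rate is assumed to stay unchanged otherwise, and the starting rate and scores are hypothetical.

```python
RATE_STEP_WPM = 13            # increment stated in the text
PASS_PERCENT = 60             # criterion stated in the text
QUESTIONS_PER_SEGMENT = 10    # questions per segment stated in the text


def next_fast_rate(current_wpm, n_correct, n_questions=QUESTIONS_PER_SEGMENT):
    """Return the fast-speech rate for the next segment.

    Assumption: the rate rises when at least 60% of the questions are
    correct and is otherwise left unchanged.
    """
    if n_correct * 100 >= PASS_PERCENT * n_questions:
        return current_wpm + RATE_STEP_WPM
    return current_wpm


def run_segments(start_wpm, scores):
    """Apply the rule over successive segments and return each new rate."""
    rates = []
    rate = start_wpm
    for n_correct in scores:
        rate = next_fast_rate(rate, n_correct)
        rates.append(rate)
    return rates


# Hypothetical starting rate and segment scores, for illustration only.
session_rates = run_segments(130, [7, 5, 6])
print(session_rates)
```

Integer arithmetic is used for the 60% comparison so that a score of exactly 6 out of 10 is not affected by floating-point rounding.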

After  the  daily  training  sessions  were  completed,  subjects  were  given 
the standardized comprehension criterion test. The leave time earned by
subjects  in  Groups  A  and  B  was  not  affected  by  their  performance  on  this 
test. 

Results  and  Discussion.   Performance  of  each  of  the  experimental  groups  on 
the  criterion  test  was  compared  to  an  untrained  control  (UC)  group  of  53 
Army  subjects  drawn  from  the  same  population  of  soldiers  as  the  experimental 
subjects.  This  group  had  no  prior  training  and  only  listened  to  the  criterion 
passages  presented  at  normal  speed.   Mean  comprehension  scores  on  the 
criterion  test  for  the  untrained  control  group  and  the  experimental  groups 
are  plotted  in  Figure  1.   Statistical  analyses  indicated  that  the  two 
experimental  groups  receiving  incentives  comprehended  speech  played  at 
rates  2.2  times  faster  than  normal,  or  286  wpm,  without  degradation  of 
performance  on  the  criterion  test,  whereas  performance  of  subjects  trained 
without  incentives  was  significantly  degraded  when  compressed  speech  was 
presented  in  the  criterion  test  at  rates  faster  than  240  wpm. 
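The speed multipliers and the word rates quoted above can be reconciled with a short calculation. The text equates 2.2 times normal with 286 wpm, which implies a normal rate of 130 wpm; that baseline is an inference from the quoted figures rather than a value the paper states directly. The sketch below converts the five test speeds on that assumption.

```python
NORMAL_RATE_WPM = 130   # inferred from the text: 286 wpm / 2.2
SPEED_FACTORS = [1.0, 1.5, 1.85, 2.2, 2.55]


def to_wpm(factor, normal_wpm=NORMAL_RATE_WPM):
    """Convert a speed multiplier to words per minute."""
    return factor * normal_wpm


rates_wpm = [to_wpm(f) for f in SPEED_FACTORS]
for factor, wpm in zip(SPEED_FACTORS, rates_wpm):
    print(f"{factor:.2f}X is about {wpm:.1f} wpm")
```

On this reading, 1.85X corresponds to about 240 wpm, so the degradation point given in wpm in the text and the one given as a multiplier in the abstract describe the same rate.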

It should be noted that the nonincentive groups included (1) a group
who merely listened to speech segments at normal rates for one week with no
knowledge of their performance on the daily probes, (2) a group who
listened to compressed speech at 2.2 times faster than normal speed, with
no knowledge of performance, and (3) a group given knowledge of performance
on each of the daily probes as well as gradually increasing rates of
compressed speech which were dependent on the subject's performance.
(It will be recalled that the only difference between this group, PANL,
and the point acquisition and leave group, PAL, was the lack of
opportunity to earn leave time.) For this population, the results
indicate that merely giving information (feedback) about performance
is not sufficient to ensure good performance (Fig. 1).

[Figure 1. Mean comprehension scores on the criterion test for the
untrained control (UC) group and each experimental group, plotted against
words per minute (130, 195, 240, 286, 332).]

It  is  instructive  to  compare  the  pretest  data  with  those  reported 
by  Foulke  (1968)  on  listening  comprehension  as  a  function  of  word  rate  for 
untrained  listeners.   Foulke  found  that  the  comprehension  scores  of 
college  students  untrained  in  listening  to  compressed  speech  did  not 
significantly decline until word rate was increased above 250 wpm. Foulke's
results are similar to those obtained for the nonincentive groups (PANL,
FC and NC) in the present study. The comprehension scores in this study
declined at about the same number of wpm as those in Foulke's study,
whereas the performance of those trained using incentive methods (PAL
and PLAL) did not drop off until they were presented with rates faster
than 286 wpm.

EXPERIMENT II.

The second experiment used a new psychophysical technique developed
in our laboratory by Dr. Henry deHaan¹ to measure the intelligibility of
connected speech. Typically, methods for the evaluation of speech compression
techniques have relied on objective standardized comprehension measures
(Foulke, 1964; McLain, 1962) or on intelligibility measures for single
words, such as word identification (Garvey, 1953), reaction time measures
(Foulke, 1969) or the threshold intensity necessary for word identification
(Calearo and Lazzaroni, 1957). In applying deHaan's measure, thirty-two Army
enlisted personnel used an automatic threshold tracking device to indicate the
maximum rate of speech they could understand for passages of compressed
and speeded speech.

Two  variables  were  studied:   method  of  rate  control  (compressed  vs. 
speeded) and rate of change (2.1 wpm, 4.2 wpm, 8.4 wpm, and 16.8 wpm).
Each subject listened to passages of compressed speech (taken from
Barbara Tuchman's The Proud Tower), whose rate automatically increased
or decreased at one of the four constant rates of change. Subjects
were instructed to press a hand-held switch when they could no longer
understand the speech, and to keep the switch depressed, to ensure a
constant  decrease  in  speech  rate,  until  speech  was  once  again  understandable. 
When  the  switch  was  not  depressed  the  rate  of  speech  gradually  increased 
at  one  of  the  four  rates  of  change. 
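The tracking procedure can be illustrated with a toy simulation. Everything in it beyond the four rates of change is an assumption made for illustration: the listener is idealized as having one fixed threshold, reaction time is modelled as a fixed lag in time steps, the rate changes by one increment per step, and the threshold, starting rate and lag values are hypothetical. The paper does not state how its thresholds were computed from the tracking record, so the excursion measure below is only a stand-in.

```python
def simulate_tracking(threshold, start, change_per_step, lag_steps, n_steps):
    """Return the speech rates at which the direction of change reversed.

    The idealized listener wants the switch down whenever the current rate
    exceeds `threshold`, but the switch follows that decision only
    `lag_steps` steps later (a stand-in for reaction time).
    """
    rate = start
    switch_down = False
    wants_down = []      # the listener's decision at each step
    reversals = []
    for step in range(n_steps):
        wants_down.append(rate > threshold)
        delayed = step - lag_steps
        new_state = wants_down[delayed] if delayed >= 0 else False
        if new_state != switch_down:
            reversals.append(rate)   # rate at which the direction flips
            switch_down = new_state
        rate += -change_per_step if switch_down else change_per_step
    return reversals


def mean_excursion(reversals):
    """Mean peak minus mean trough of the alternating reversal points."""
    peaks = reversals[0::2]     # the first reversal is a peak when starting low
    troughs = reversals[1::2]
    return sum(peaks) / len(peaks) - sum(troughs) / len(troughs)


# Hypothetical listener: threshold 250 wpm, start 200 wpm, lag of 3 steps.
excursions = {}
for change in (2.1, 4.2, 8.4, 16.8):
    reversals = simulate_tracking(250.0, 200.0, change, 3, 200)
    excursions[change] = mean_excursion(reversals)
    print(f"rate of change {change:5.1f}: excursion {excursions[change]:6.1f} wpm")
```

In this toy model the excursion equals (2 x lag_steps + 1) x change_per_step, so it widens in proportion to the rate of change even though the listener's threshold never moves.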

The two methods of rate control and four rates of change were
combined to make 8 experimental conditions. The order of presentation
of the experimental conditions was partially counterbalanced. Subjects
listened to all eight conditions in one of four orders: (1) compressed
speech followed by speeded speech for each rate of change with rates of
change that progressively decreased; (2) compressed speech followed by
speeded speech for each rate of change with rates of change that progressively
increased; (3) speeded speech followed by compressed speech for each rate of
change with rates of change that progressively decreased; and (4) speeded
speech followed by compressed speech for each rate of change with rates
of change that progressively increased.
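The four presentation orders can be generated mechanically. The sketch below is one reading of the description above: within each rate of change, both methods are presented back to back, and the rates of change run either downward or upward. The function and variable names are illustrative assumptions.

```python
RATES_OF_CHANGE = (2.1, 4.2, 8.4, 16.8)   # values stated in the text


def presentation_order(first_method, descending):
    """List the eight (method, rate of change) conditions in order."""
    second_method = "speeded" if first_method == "compressed" else "compressed"
    order = []
    for rate in sorted(RATES_OF_CHANGE, reverse=descending):
        order.append((first_method, rate))
        order.append((second_method, rate))
    return order


ORDERS = {
    1: presentation_order("compressed", descending=True),
    2: presentation_order("compressed", descending=False),
    3: presentation_order("speeded", descending=True),
    4: presentation_order("speeded", descending=False),
}

for number, order in ORDERS.items():
    print(number, order)
```

Each order contains all eight conditions exactly once; only the method sequence and the direction of the rate-of-change series differ, which is why the counterbalancing is partial rather than complete.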

¹A complete description of the method may be found in a paper presented
by Dr. deHaan at the 1975 Psychonomic Society meeting.


[Figure 2. Mean absolute and difference thresholds, in words per minute,
for compressed and speeded speech at each rate of change.]


Results and Discussion. The mean absolute and difference thresholds for
compressed and speeded speech for each rate of change are plotted in
Figure 2. In general, the mean absolute threshold did not vary with
rate of change, although the difference threshold varied with both method
of speech rate control and rate of speech change. The mean absolute
threshold for the fastest rate of change was, however, statistically significantly
different from the mean absolute threshold for the slowest rate of change for
both compressed and speeded speech. The difference threshold for compressed
speech was significantly greater than that for speeded speech at all
rates of change except for the slowest rate of change, where the differences
were nonsignificant.

As the rate of change increased, the difference threshold increased
in  size  both  for  compressed  and  speeded  speech.   This  constant  increase 
in  the  difference  threshold  with  increasing  rate  of  change  was  statistically 
significant  both  for  compressed  and  speeded  speech  conditions.   In  the 
present  experiment  the  subject's  ability  to  determine  his  intelligibility 
threshold  was  most  variable  for  the  fastest  rate  of  change  in  speech  speed. 
If  reaction  time  remains  relatively  constant,  as  rate  of  change  in  speech 
speed  increases,  one  might  expect  to  find  increases  in  variability  merely 
due  to  this  artifact.   However,  the  fact  that  the  degree  of  variability  is 
greater  for  compressed  than  for  speeded  speech  suggests  that  these  increases 
in  variability  are  due  to  something  more  than  a  reaction  time  artifact. 

Subjects  participating  in  this  experiment  were  given  the  criterion  test 
given  to  subjects  in  the  first  experiment.   Pearson  correlation  coefficients 
between  absolute  thresholds  and  performance  on  the  criterion  test  were  not 
statistically  significant,  suggesting  that  the  absolute  threshold  measured 
in  the  present  experiment  is  a  measure  of  intelligibility  rather  than 
comprehension. 

The  mean  absolute  threshold  for  compressed  speech  was  260  wpm  (2.1 
times  faster  than  normal  speed)  and  211  wpm  (1.7  times  faster  than  normal 
speed)  for  speeded  speech,  using  deHaan's  method  at  the  slowest  rate  of 
change.  That  is,  subjects  indicated  that  compressed  speech  presented  at 
2.1X and speeded speech at 1.7X were approximately equal in intelligibility.
These  results  are  similar  to  those  reported  by  Garvey  (1953)  concerning 
the  intelligibility  of  single  spondaic  words.   Garvey  (1953)  reports 
approximately  equal  intelligibility  scores  for  compressed  words  presented 
at  twice  normal  speed  (95%  correct  identification)  and  for  speeded  words 
presented  at  rates  of  1.5  times  faster  than  normal  speed  (97%  correct 
identification). The similarity of deHaan's results to Garvey's measure
of  intelligibility  provides  additional  support  for  the  argument  that  deHaan's 
measure  is  a  method  for  the  measurement  of  the  intelligibility  of  connected 
speech. 
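As a consistency check on the figures quoted above, each threshold can be divided by its multiplier to recover the normal rate it implies. The paper does not state the normal rate of these passages, so the values computed here are inferences from the reported numbers only.

```python
def implied_normal_rate(threshold_wpm, speed_factor):
    """Normal speaking rate implied by a threshold and its multiplier."""
    return threshold_wpm / speed_factor


from_compressed = implied_normal_rate(260, 2.1)   # compressed speech threshold
from_speeded = implied_normal_rate(211, 1.7)      # speeded speech threshold

print(f"implied by compressed threshold: {from_compressed:.1f} wpm")
print(f"implied by speeded threshold:    {from_speeded:.1f} wpm")
```

Both pairs point to nearly the same baseline, roughly 124 wpm, which suggests the two multipliers were computed against a common normal rate.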

In  describing  the  difference  between  intelligibility  and  comprehension 
Foulke  (1971)  has  stated: 

The demonstration of comprehension imposes a much more complex
task on the listener than the demonstration of intelligibility.
The behavior upon which the measurement of intelligibility
depends implies registration of the stimulus word, some kind of
short-term memory storage, and the transduction of the stored
item to an overt response. On the other hand, the behavior on
which the measurement of comprehension is based implies continuous
registration and short-term storage of stimulus material, as well
as the continuous encoding, or simplification by reorganization,
and selective discarding of stimulus material so that it
can be transferred to long-term memory storage, and a final
decoding step for the transduction of material in long-term
storage to overt behavior. (p. 99)

To summarize, the interpretation of deHaan's absolute threshold
as  a  measure  of  intelligibility  of  connected  speech  is  based  on 
(1)  lack  of  correlation  with  a  standard  measure  of  comprehension;  (2)  the 
similarity  of  deHaan's  results  and  Garvey's  measure  of  intelligibility 
for  single  words,  and  (3)  the  consistency  of  Foulke's  definition 
describing  the  measurement  of  intelligibility  and  comprehensibility  and 
deHaan's  methodology.   These  factors  support  the  contention  that  deHaan 
has  a  developed  a  new  method  for  the  measurement  of  the  intelligibility 
of  connected  speech. 

deHaan's  methodology  has  potential  application  in  areas  such  as  the 
comparison of speech compression devices and evaluation of the intelligibility
of new algorithms for speech synthesis, which traditionally have relied
on  the  quick,  subjective  estimates  of  a  few  listeners.   This  method,  when 
used  at  rates  of  change  not  greater  than  2.1  words  per  second,  offers  a  quick, 
reliable  and  objective  technique  for  comparison  of  speech  compression  devices 
and  methods. 

The  application  of  these  training  and  evaluation  techniques  is  the  next 
phase  of  the  Army  program. 


References 


Barnard,  D.  P.  A  Study  of  the  Effect  of  Differentiated  Auditory 
Presentation  on  Listening  Comprehension  and  Rate  of  Reading 
Comprehension  at  the  Sixth  Grade  Level.  Dr.  Boston,  Mass: 
Boston  U.,  1970.  D.  A.  31:  2241A,  1970.  in  Duker,  S.  Time- 
Compressed  Speech:  An  Anthology  and  Bibliography  in  Three 
Volumes.  Metuchen,  New  Jersey:  The  Scarecrow  Press,  Inc., 
1974. 

Calearo, C. and Lazzaroni, A. "Speech Intelligibility in Relation
to the Speed of the Message." Laryngoscope 67:410-19, May,
1957. Excerpts in Duker, S. Time-Compressed Speech: An Anthology and
Bibliography in Three Volumes. Metuchen, New Jersey: The
Scarecrow Press, Inc., 1974.

de Haan, H. J. "Thresholds of Understanding of Connected Time-Compressed
Speech." In preparation.

Duker, S. Time-Compressed Speech: An Anthology and Bibliography
in Three Volumes. Metuchen, New Jersey: The Scarecrow Press,
Inc., 1974.

Foulke, E. "The Perception of Compressed Speech." In The Perception
of Language (edited by David L. Horton and James J. Jenkins).
Columbus, Ohio: Charles E. Merrill, p. 79-107, 1971.

Foulke, E. Comprehension of Rapid Speech by the Blind; Part III.
Final Progress Report on Cooperative Research Project No. 2430
covering the period from March 1, 1964 to June 30, 1968.
Louisville, Kentucky: U. Louisville, 1969. in Duker, S.
Time-Compressed Speech: An Anthology and Bibliography in Three
Volumes. Metuchen, New Jersey: The Scarecrow Press, Inc., 1974.

Foulke, E. "Listening Comprehension as a Function of Word Rate."
Journal of Communication, 18:198-206, 1968.

Foulke, E. Comprehension of Rapid Speech by the Blind; Part II.
Final Progress Report covering the period from September 1,
1961 to February 29, 1964 on Cooperative Research Project No. 1370.
Louisville, Kentucky: U. Louisville, 1964. ED003264. Chpt. 2.
Chpt. 3. in Duker, S. Time-Compressed Speech: An Anthology and
Bibliography in Three Volumes. Metuchen, New Jersey: The
Scarecrow Press, Inc., 1974.

Garvey,  W.  D.   "The  Intelligibility  of  Speeded  Speech."   Journal  of 
Experimental  Psychology  45:  102-08,  1953. 




Grumpelt, H. R. and Rubin, E. Speed Listening Skill By the Blind as
a Function of Training. Report on a U. S. Office of Education
Grant No. OEG-3-8-080024-0021(010). Chestertown, Maryland:
Washington College, 1968. ED 025 092. in Duker, S. Time-
Compressed Speech: An Anthology and Bibliography in Three
Volumes. Metuchen, New Jersey: The Scarecrow Press, Inc.,
1974.

Harley, R. "An Experimental Program in Compressed Speech at the
Tennessee School for the Blind." Proceedings of the Louisville
Conference  on  Time  Compressed  Speech,  October  19-21,  1966. 
(Edited  by  Emerson  Foulke.)   Louisville,  Kentucky:   Center  for 
Rate  Controlled  Recordings  [Perceptual  Alternatives  Laboratory], 
U.  Louisville,  1967,  p. 63-66. 

Klineman, J. The Effects of Training Sessions on the Ability to
Comprehend Compressed Speech. Mstrs. Pittsburgh: U. Pittsburgh,
1963. in Duker, S. Time-Compressed Speech: An Anthology and
Bibliography in Three Volumes. Metuchen, New Jersey: The
Scarecrow Press, Inc., 1974.

Lambert, J. V., Shields, J. L., Gade, P. A., and Dressel, J. D.
Comprehension of Compressed Speech as a Function of Training.
ARI Technical Report, 1976 (in preparation).

McLain,  J.  R.   "A  Comparison  of  Two  Methods  of  Producing  Rapid  Speech." 
International  Journal  for  the  Education  of  the  Blind,  12:40-42, 
1962.   in  Duker,  S.   Time-Compressed  Speech:   An  Anthology  and 
Bibliography in Three Volumes. Metuchen, New Jersey: The
Scarecrow Press, Inc., 1974.

Orr,  D.  B.,  Friedman,  H.  L.,  and  Williams,  J.  C.   "Trainability  of 
Listening  Comprehension  of  Speeded  Discourse."  Journal  of 
Educational  Psychology  56:  148-56,  1965. 

Voor,  J.  B.   The  Effect  of  Practice  Upon  Comprehension  of  Time-Compressed 
Speech.   Mstrs.   Louisville,  Kentucky:   U.  of  Louisville,  1962. 
in Duker, S. Time-Compressed Speech: An Anthology and Bibliography
in Three Volumes. Metuchen, New Jersey: The Scarecrow Press,
Inc., 1974.




Can Students in a Self-Paced Course Save Time and Earn Higher
Grades Using Time-Compressed Speech?
by Short, S.




Abstract 


CAN STUDENTS IN A SELF-PACED COURSE SAVE TIME AND
EARN HIGHER GRADES USING TIME-COMPRESSED SPEECH?


Sarah  Short 

The  purpose  of  this  study  was  to  determine,    for  the  first  time,   if 
sighted  students  would  save  time  and  achieve  higher  scores  when  listening 
to  an  entire  course  using  continuously  variable  rates  of  speed. 

The population consisted of 90 college students enrolled at Syracuse
University during the Fall semester, 1974. The course had been
systematically developed to be taught by self-paced, audiovisual methods and had
been evaluated and revised over a seven-year period. All recordings of the
22 modules were made by the same instructor, whose average speaking rate
was 150-160 wpm. Students were randomly assigned to use either variable
speed compressors or normal speed tape recorders. Posttests were taken
by computer after each module, at the student's discretion.

To measure elapsed time for listening to a module, analog chart
recorders were connected to both normal and variable speed recorders. To
evaluate achievement, the computer-administered multiple-choice cognitive
posttests were used. Analyses of variance with repeated measures were
performed on mean time and posttest measures.

The group using variable speed compressors scored significantly
(p < .01) higher on posttests when compared with students learning the
same material at normal speed. Using variable speed compressors (as
compared with normal speed recorders) resulted in an average time saving of 32%
and an average grade increase of 4.2 points on posttest scores.

Other data collected during this study included results of affective
Likert-type, listening and reading, and handedness and cerebral dominance
testing. Seventy percent of those students using variable speech quickly
adapted to, and liked, listening at faster than normal rates. Forty percent
of the students using compressors regularly listened at high rates (267-
320 wpm). There was no significant difference between the scores of these
students and students who listened at lower variable rates of speed on
listening, comprehension, vocabulary or handedness tests. However, 61% of
students using high rates of compression read at word rates above 300 per
minute. Only 32% of those using lower compression rates read at word rates
above 300.

Sighted college students enrolled in a course taught by self-instruction
methods like to use variable speech compressors, earn significantly higher
achievement scores, and save significant amounts of time when using this
equipment as compared with students using normal speed tape recorders to
learn the same information.




Can  Students  in  a  Self-Paced  Course  Save  Time 
and  Earn  Higher  Grades  Using 
Variable  Time  Compressed  Speech? 

by 

Sarah  H.  Short,  Ph.D.,  Ed.D. 

Introduction 

Because of the knowledge explosion, more information is being taught and
learned. According to Duker (1974), the amount of aural communication far
exceeds the amount of written communication. This aural communication is now
being reproduced on tapes which are available in libraries as well as used
directly for instruction.

Some students seem to learn more easily by listening while others learn better
by reading (Sticht, 1974). However, when large amounts of cognitive information
are presented aurally, the student must listen at the lecturer's rate of speech,
which may not be the best rate for the student. School classroom lecturers
average only 100 words per minute (wpm) according to Nichols, while the speed of
thought may be 400 to 800 wpm. This precipitates inattentiveness.
One solution to this problem is the use of variable rate controlled speech tapes,
so that students may continuously vary the rate at which they hear the message
while the pitch is kept normal. Foulke and Sticht (1969) published an excellent
review of research performed with accelerated speech.

Until 1972, it was possible to compress tapes only by using expensive equipment.
This was difficult to operate and was not easily adapted to varying the rate of
speech on the recorded tapes. By 1972, the second generation of compression
equipment had been built, which allowed the compressor operator to continuously
vary the tape speed. These variable rate speech compressors used solid state
integrated circuitry, making the equipment smaller, easier to use and less expensive.


Paper presented at the Third Louisville Conference on Rate-Controlled Speech,
November 3-5, 1975, University of Louisville, Louisville, Kentucky.

Dr. S. Short, 200 Slocum Hall, Syracuse University, Syracuse, New York 13210.



Since  variable  speed  compressors  have  become  available  to  use  on  an  individual 
basis,  research  done  on  precompressed  tapes  has  become  useful  primarily  as 
background  material.   Therefore,  new  questions  had  to  be  raised  concerning 
learning  at  a  preferred  rate  of  listening. 

Purpose  of  This  Study 

The purpose of this study was to examine the relationship between time spent
using variable rate speech reproduction and cognitive performance as measured by
achievement posttests. All previous research had been done using precompressed
tapes. This study differs from previous studies in that it allowed college
students to select and adjust the speed they preferred for taped modules in a
course taught by self-instruction methods. For the first time, students were
able to listen to a speaker at a rate commensurate with individual processing
abilities rather than at the speaker-determined rate or a predetermined rate of
speech compression.

The  following  questions  were  posed  for  this  study: 

1.  Can  sighted  college  students  save  time  through  the  use  of  variable  rate 
speech  compressors  as  compared  to  normal  speed  tape  recorders  when  learning 
cognitive  information  presented  by  audio  tapes? 

2.  Will  sighted  college  students  using  variable  rate  speech  compressors  to 
listen  to  cognitive  information  achieve  higher  grades  on  module  posttests 
than  the  students  listening  to  the  same  material  on  normal  speed  tape 
recorders? 

Significance  of  the  Problem 

The  question  of  saving  time  had  never  been  satisfactorily  answered  because 
precise  measuring  techniques  were  not  used  in  any  of  the  previous  research.   No 
carefully controlled studies had been performed which demonstrate that students
learn significantly more cognitive information using precompressed speech tapes
than  students  learn  by  normal  speed  tapes.   This  study  was  done  to  establish, 
with  precise  time  measurements  and  carefully  constructed  posttests,  whether  or 
not  sighted  students  learn  more  and  save  time  using  variable  time  compressed  speech 
tapes  as  compared  with  normal  speed  tapes  in  an  entire  college  course. 

Educational  Practices  Using  Compressed  Speech 

Educators have used precompressed tapes in an informal way or in a one-time
experimental mode, but few have done controlled studies with an entire course
for sighted students over a period of time. Some explorations have been concerned
with the best rate of speech for the student (Eckhardt, 1970; Gleason, Callaway,
& Lakota, 1974); some with comprehension and retention (Perry, 1970); some with
student attitudes (Libby, 1971); and some with the merits of compressing tapes
for lecture review (Boyle, 1969; Tonra, 1972).

Rossiter  (1971)  tested  college  students  on  comprehension  of  14  different 
selections  but  each  message  lasted  only  one  and  one-half  minutes,  giving  students 
very  little  time  to  acclimate  themselves  to  listen  to  compressed  speech. 

Hass (1974) found that students who listened to lecture tapes precompressed 50
percent by the pause deletion method performed better on the questions based on
the tapes than on questions based on the live lectures.

Sarenpa (1971) studied a biology course taught by Audio-Tutorial methods.
One group of 29 subjects listened to the 22 modules at a normal speed (113 to 138
wpm) while another group of 28 subjects heard the same lessons at 193 to 238 wpm,
precompressed by the pause deletion method. There was no significant difference in
achievement between the two groups, and the compressed group did show a slight
saving in time over the normal speed group. However, the control for reporting
time was in the hands of the students.




Challis (1973) at the University of Oklahoma used students enrolled in a
basic audiovisual course to determine if compressed speech tapes could be used
in an independent study laboratory. There was no significant difference on
achievement scores between subjects with high grade point averages and those
with low grade point averages using the same rates of speed. From
student-tabulated time sheets, it was concluded that those using tapes
precompressed by 30 percent had a 17 percent saving of time while students using
tapes compressed by 40 percent saved 31 percent over the normal listening time.

All of the previous studies used precompressed tapes for the students.
Rome (1972) placed a prototype model of a variable speech compressor in the
Audiovisual and Television Center of the Western Connecticut State College for
students to use when reviewing lectures. However, no achievement data were
collected.


Related  Studies  Using  Precompressed  Speech  Tapes  in  the  Nutrition  Department 
at  Syracuse  University 

At Syracuse University, a basic course in foods and nutrition has been offered
since 1967 using self-instruction methods with audio tapes, video tapes, slides,
8 mm films, computer assisted instruction and workbooks (Short, 1969, 1970, 1975).
Pre- and post-achievement tests have been validated and administered each
semester. Affective questionnaires and tests have been used. Compressed speech
tapes were first considered for this course when space was needed in the learning
laboratory. It was thought that students would save time while learning, thereby
making room for more students to use the small learning laboratory. This
hypothesis led to the formation of several questions: (1) would students accept
speeded tapes? (2) would comprehension and retention be as good as with normal
speed tapes? and (3) would the students actually save time so that students in
other nutrition courses could share the laboratory?

Studies were conducted in both semesters of 1970, 1971 and 1972 (Short, 1971,
1973, 1974, 1975) using tapes compressed by 20%, 30%, 40% and 55%. Pre- and
posttests were administered, and both affective and personality tests were used.
Posttests  administered  after  a  unit  of  instruction  and  up  to  eleven  weeks  after 
a  unit  showed  no  significant  difference  between  those  listening  to  normal  tapes 
and  those  listening  to  any  rate  of  pre-compressed  tapes.   The  results  of  the 
Likert-type  affective  questionnaire  indicated  that  students  liked  using  20  or  30 
percent  compressed  tapes.   When  students  had  a  choice  in  compression  speeds,  75 
percent  chose  compressed  tapes  and  indicated  that  they  would  like  to  use  30  percent 
compression for the rest of the semester. No correlation was found between any
factor on the California Personality Inventory and achievement scores or with
results of a Likert-type questionnaire. Using closed circuit television to
unobtrusively monitor student behavior (Steffan, 1971), it was judged that students
did save time using precompressed tapes as compared with normal speed tapes.

It  was  concluded  from  these  studies  using  precompressed  tapes,  that  students 
in  a  basic  food  and  nutrition  course  did  learn  cognitive  information  and  saved 
time,  using  student  controlled  time  measurement. 

Methods 

In Nutrition and Food Science 115 at Syracuse University during the fall
semester of 1974, 90 students were divided into two treatment groups by using a
random number table. None of the students indicated that they were aware of a
hearing disability. Students from all four classes (Freshmen, Sophomores, Juniors
and Seniors) were equally divided between the treatment groups. An analysis of
variance was calculated for the first eight module posttests to ascertain if there
were significant differences in student posttest achievement before the treatment
was started. This was a self-paced course, and, therefore, students started using
the normal cassette tape recorders and the compressors at various points in the
course. However, no student started using the new equipment until module nine,
and so this analysis of variance was calculated using the first eight module
posttest results.

The  analysis  of  variance  for  treatment  groups  on  scores  from  the  first  eight 
modules  is  presented  in  Table  1.   The  means  of  these  posttests  are  presented  in 
Table  2. 


Table 1
Analysis of Variance for Scores from First Eight Modules

Source                          df        MS         F

Between Subjects                89
  A (treatments)                 1     1827.24     2.305
  Subjects within groups        88      792.71

Within Subjects                630
  B (Modules)                    7     1796.96    12.68*
  AB                             7      192.28     1.36
  B x Subjects within groups   616      141.72

*p < .05




Table 2
Means of First Eight Module Posttests

                       Posttest Scores
Module Number      Normal      Compressed

      1             80.67         89.56
      2             88.22         88.89
      3             88.89         88.00
      4             81.11         85.11
      5             83.78         84.22
      6             76.67         78.67
      7             79.56         85.33
      8             74.29         76.89


Results shown in Table 1 indicate that there was no significant difference at
the .05 level between students in the two treatment groups on means of posttests
taken after the first eight modules, before treatment started. Therefore, the
results reported on data collected after the students used different tape
recorders may be due to the treatment.

Table 1 showed that there was a significant difference between modules, but this
is irrelevant since some modules may be more difficult than others. There is no
significant module-by-treatment interaction.
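
The F ratios reported in Table 1 can be checked from the printed mean squares. In a two-factor design with repeated measures on one factor, the treatment effect is tested against the subjects-within-groups mean square, and the module and interaction effects are tested against the within-subjects error mean square. The following is only an illustrative sketch; the function name `f_ratio` and the constant names are assumptions introduced here, and the numbers are those printed in Table 1.

```python
def f_ratio(ms_effect, ms_error):
    """Return the F ratio: an effect mean square over its error mean square."""
    return ms_effect / ms_error


# Mean squares as printed in Table 1 (first eight modules, before treatment).
MS_TREATMENTS = 1827.24        # A (treatments)
MS_SUBJECTS_WITHIN = 792.71    # subjects within groups (between-subjects error)
MS_MODULES = 1796.96           # B (modules)
MS_INTERACTION = 192.28        # AB (module-by-treatment interaction)
MS_WITHIN_ERROR = 141.72       # within-subjects error term

f_treatments = f_ratio(MS_TREATMENTS, MS_SUBJECTS_WITHIN)
f_modules = f_ratio(MS_MODULES, MS_WITHIN_ERROR)
f_interaction = f_ratio(MS_INTERACTION, MS_WITHIN_ERROR)

print(f"treatments:  F = {f_treatments:.3f}")
print(f"modules:     F = {f_modules:.2f}")
print(f"interaction: F = {f_interaction:.2f}")
```

The three ratios agree with the F column of Table 1 to the precision printed there.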

The design of this study was the pretest-posttest control group design
(Design 4) from Campbell and Stanley (1963). The treatment sessions were run
simultaneously and had the same instructor since the treatment involved listening
to the same information. A normal routine in the learning laboratory was followed
for both groups. The equipment used by both groups looked very similar and the
instruments used to measure time were placed in an unobtrusive position. Reactive
arrangements were kept to a minimum.

The independent variable in this study was the speech compression. The two
categories were: variable speech compression and no compression. The dependent
variables were: (1) the time the student used to listen to a module of instruction
and (2) the scores the student achieved on cognitive tests.

A cognitive pretest consisting of 100 multiple choice questions revised over
a seven-year period was given during the first class period. The posttests for
each of the 22 modules in the course were administered on time sharing computer
terminals at the student's convenience.

In order to precisely measure the time each student spent listening to taped
information and the rate of speed chosen for listening, all tape recorders were
connected to graphic recorders. These analog chart recorders, placed so that they
would be unobtrusive in the learning laboratory, made a record of events (tape
recorder "off" or "on") and voltage levels (compressor rate settings). The time
was noted as minutes and quarter parts of minutes.

The course work contained basic nutrition and food science concepts in the
Department of Nutrition at Syracuse University. The instructor taped the
information so that it would sound as if the student were being tutored on an
individual basis. The word rate on the tapes was calculated to be from 150-160 wpm
and the length of the taped modules was from 12 minutes 19 seconds to 25 minutes
51 seconds.

For this study, four variable speed tape recorders and four normal speed tape
recorders were used. The variable speed machines consisted of three Copycorder
Model 103 compressors and one Varispeech-I compressor. The normal speed
equipment consisted of four Sony cassette tape recorders. On the variable
compressors, the speech rate could be continuously varied by the student. All
students were given the multiple choice pretest during an introductory class
session where procedures were explained. During the fifth week of the semester,
students were informed that they had been randomly assigned to one of two groups
in order to test two types of tape recorders. The control group (Group I) would
use cassette tape recorders and the experimental group (Group II) would use
variable speed cassette tape recorders. Each tape recorder used for the
experiment was connected to an analog chart recorder unobtrusively placed in the
learning laboratory.

The chart recorder paper tapes from both the normal speed and variable speed
compressor tape recorders were analyzed for time elapsed (time the student started
listening to the time the student finished the module). The paper tape recorded
from the compression equipment also measured the rate changes used by the student.
A record was kept of students who predominantly used normal to low compression
(compressed by 20-30 percent), those who used medium rates (compressed by 30-40
percent) and those who used high rates of compression (compressed by 40-50 percent)
for large blocks of time during each tape.
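
The compression categories can be related to word rates with a short calculation. The sketch below assumes that "compressed by c percent" means the playing time is shortened by c percent with every word retained, so that the word rate is the original rate divided by (1 - c/100); this reading is an assumption made for illustration, and the function name `compressed_wpm` is hypothetical.

```python
def compressed_wpm(original_wpm, percent_compression):
    """Word rate after shortening playing time by the given percentage.

    Assumes compression removes `percent_compression` percent of the
    playing time while keeping every word, so the rate scales by
    1 / (1 - percent_compression / 100).
    """
    if not 0 <= percent_compression < 100:
        raise ValueError("percent_compression must be in the range [0, 100)")
    return original_wpm / (1 - percent_compression / 100)


# Upper end of the instructor's recorded rate (150-160 wpm).
for percent in (20, 30, 40, 50):
    print(f"{percent} percent compression: {compressed_wpm(160, percent):.0f} wpm")
```

Under this assumption, a 160 wpm recording compressed by 40 to 50 percent plays at about 267 to 320 wpm, which agrees with the range given in the abstract for the high-rate listeners.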

The  mean  time  the  experimental  group  spent  listening  to  tapes  on  variable 
speed  compressors  was  compared  to  the  mean  time  the  control  group  spent  listening 
to  normal  speed  tapes.   This  was  done  by  a  two-factor  analysis  of  variance  with 
repeated  measures  on  one  factor  (Winer,  1971).   The  means  of  achievement  posttest 
scores  of  the  two  groups  were  compared  in  the  same  manner.   The  .01  level  of 
significance  was  chosen  to  test  the  hypotheses.   In  all  of  the  related  compressed 
speech  studies  performed  in  the  Nutrition  Department  learning  laboratory  since 
1970,  a  more  liberal  significance  level  was  chosen  to  avoid  overlooking  any 
important  relationship  that  might  exist.   Although  no  significant  difference  was 
found in any of these related studies, trends were noticed. Students allowed to
use precompressed tapes saved time and did as well as (and in some cases better
than) when they were allowed to use normal speed tapes or when compared to other
students using normal speed tapes. In this study, a more rigorous significance
level  was  chosen  to  decrease  the  probability  of  rejecting  a  true  null  hypothesis. 
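
The degrees of freedom for this kind of analysis follow from the number of subjects, groups, and modules. The sketch below partitions them for a two-factor design with repeated measures on the module factor; the function name and dictionary keys are hypothetical, while the 90 subjects, two treatment groups, and eight pretreatment or seven treatment modules come from the text.

```python
def repeated_measures_df(n_subjects, n_groups, n_modules):
    """Degrees of freedom for a groups-by-modules design with
    repeated measures on the modules factor."""
    treatments = n_groups - 1
    subjects_within_groups = n_subjects - n_groups
    modules = n_modules - 1
    return {
        "between_subjects": n_subjects - 1,
        "treatments": treatments,
        "subjects_within_groups": subjects_within_groups,
        "within_subjects": n_subjects * modules,
        "modules": modules,
        "interaction": treatments * modules,
        "modules_x_subjects_within_groups": subjects_within_groups * modules,
    }


# Seven modules completed under treatment, then the eight pretreatment modules.
for n_modules in (7, 8):
    print(n_modules, repeated_measures_df(90, 2, n_modules))
```

For seven modules this gives 1 and 88 degrees of freedom for the treatment test and 6 and 528 for the module and interaction tests; for eight modules the within-subjects values are 7 and 616.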

Results 

The amounts of time spent listening to seven different modules by the two
treatment groups were measured by connecting analog chart recorders to the
variable speech compressors and to the normal tape recorders to record the time
spent by each student on each module. The results of the analysis of variance
performed on these data are presented in Table 3. The means for the seven modules
are presented in Table 4.

Table 3
Analysis of Variance for Time Spent Listening to
Seven Different Modules

Source                          df        MS          F

Between Subjects                89
  A (treatments)                 1     5403.702    20.196**
  Subjects within groups        88      267.566

Within Subjects                540
  B (Modules)                    6     2222.604    69.684**
  AB                             6       47.906     1.502
  B x Subjects within groups   528       31.896

**p < .01




Table 4
Means of Time Spent Listening to
Seven Different Modules

                     Mean Listening Time
Module Number      Normal      Compressed

     15             33:22         26:31
     16             28:02         23:33
     17             32:01         24:01
     18             33:08         28:02
     19             23:12         17:32
     20             24:17         20:09
     21             38:13         30:04


The analysis of variance for the between-subject variable, F (1,88) = 20.196,
falls within the .01 region of rejection for a one-tailed test. The null hypothesis
can be rejected at the .01 level of significance. The analysis of variance also
indicated a significant within-subjects variable, F (6,528) = 69.684, p < .01,
indicating that some modules required more time than others. There is no
statistically significant module-by-treatment interaction. This means that,
regardless of the differences in modules, the compressed group saved time.

From  an  inspection  of  the  means  presented  in  Table  4,  it  may  be  seen  that 
the  compressed  group  spent  less  time  listening  to  each  module.   The  time  saved  in 
each  module  was  calculated  by  subtracting  the  means  of  the  two  treatment  groups. 
The  average  time  saved  was  6  minutes  and  26  seconds.   The  average  taped  time  (when 
heard  on  a  normal  speed  tape  recorder)  of  these  modules  was  19  minutes  and 
36  seconds,  resulting  in  an  average  time  saved  of  32.34  percent. 



The posttest scores were collected during computer testing on each of the
seven modules. An analysis of variance with repeated measures was performed on
these data, and the results are presented in Table 5. The means for the seven
modules are presented in Table 6.

Table 5
Analysis of Variance for Posttest Scores on
Seven Different Modules

Source                          df        MS          F

Between Subjects                89
  A (treatments)                 1     2782.50      7.049**
  Subjects within groups        88      394.74

Within Subjects                540
  B (Modules)                    6     1701.58     18.923**
  AB                             6      178.17      1.981
  B x Subjects within groups   528       89.92

**p < .01




Table 6
Means of Posttest Scores on
Seven Different Modules

                       Posttest Scores
Module Number      Normal      Compressed

     15             87.11         92.22
     16             86.44         90.22
     17             81.11         86.89
     18             86.33         88.89
     19             91.67         93.11
     20             92.56         94.00
     21             76.47         85.78


The analysis of variance for the between-subject variable, F (1,88) = 7.049,
falls within the .01 region of rejection for a one-tailed test. The null
hypothesis was, therefore, rejected at the .01 level of significance. The analysis
of variance also indicated a significant within-subject variable,
F (6,528) = 18.923, p < .01, which may mean that some modules were more difficult
than others. There was no statistically significant module-by-treatment
interaction. This implies that, regardless of differences in the modules, the
compressed group achieved significantly higher scores than the group using normal
speed tape recorders.

From an inspection of the means presented in Table 6, it may be seen that the
compressed group achieved higher scores than the group using normal speed tape
recorders. The difference in means between the normal and compressed groups on
module posttests ranged from 1.4 to 9.3 grade points, resulting in an average grade
point increase of 4.2 for the compressed group.
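
The range and average of these differences can be reproduced directly from the module means. The sketch below is illustrative only; the names `TABLE_6_MEANS` and `score_gains` are assumptions introduced here, and the values are the means printed in Table 6.

```python
# Mean posttest scores by module as printed in Table 6: (normal, compressed).
TABLE_6_MEANS = {
    15: (87.11, 92.22),
    16: (86.44, 90.22),
    17: (81.11, 86.89),
    18: (86.33, 88.89),
    19: (91.67, 93.11),
    20: (92.56, 94.00),
    21: (76.47, 85.78),
}


def score_gains(means):
    """Compressed-minus-normal difference in mean score for each module."""
    return {
        module: compressed - normal
        for module, (normal, compressed) in means.items()
    }


gains = score_gains(TABLE_6_MEANS)
average_gain = sum(gains.values()) / len(gains)

print(f"smallest gain: {min(gains.values()):.1f}")
print(f"largest gain:  {max(gains.values()):.1f}")
print(f"average gain:  {average_gain:.1f}")
```

The smallest, largest, and average differences round to 1.4, 9.3, and 4.2 points, matching the figures given in the text.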

Discussion  and  Conclusions 


Unlike most experimental studies of media variables, more than one lesson was
involved. This study was based not on one, but on seven modules. The experimental
and control groups were treated exactly the same except that the experimental group
was allowed to use time compressed speech equipment which looks similar to the
ordinary tape recorders which the control group was using. Since the two groups
were treated similarly, it seems that the stimulating effect of being singled out
for special treatment (Hawthorne effect) was well controlled.

This study showed that students using variable speech compressors achieved
significantly higher scores on posttests than students using normal speed tape
recorders. The average difference was almost half a grade, while the maximum
difference in means was almost a whole grade higher for the compressed group.
Students using variable speed compressors as compared with normal speed tape
recorders might, for example, be expected to raise their grade from a B to a B+ or
even to an A. There was no significant difference in the means of scores on the
first eight modules before treatment was started. It was not until one group
started using speech compressors that a significant difference was observed in
the posttest scores, which implies that the speech compressors did make a difference.

Data collected (Short, 1974) on self-paced module posttesting by computer
showed 15 to 22 times as many A grades as when students were tested by the same
tests given at stated times throughout the semester using paper and pencil. It
might be said that the results of the present study were reflecting the high
scores observed when students take tests on their own schedule by computer. If
the results of comparing the first eight module posttests are again reviewed, it
will be seen that there was not a significant difference in posttest scores until
the compressors were used.

This study also showed that significant amounts of time were saved using
variable speed compressors. It had been assumed that time-saving would be a
result of using time compressed speech, and, therefore, it had never been precisely
measured. In an independent learning laboratory using self-instruction or
audio-tutorial methods, one of the major differences from conventional methods is
allowing students to repeat tapes and view slides as often as necessary to meet
behavioral objectives. In related studies in the Nutrition learning laboratory,
many students using precompressed tapes felt that they did not save time because
they had to stop the tape, rewind and listen again in order to understand the
information presented. In this study, time measurement was removed from students'
control and monitored by machine (analog chart recorders). The actual time saved
by the use of variable speech compressors was an average of 32 percent as
compared to using normal speed tapes. This may be important for students who need
the time for other courses and for the instructor who needs to use the learning
laboratory for other classes.

The  study  also  suggests  that  students  did,  in  fact,  use  the  speech  compression 
technology  that  was  made  available  to  them.   Some  students  used  the  compressor 
capabilities  more  than  others. 

The  following  conclusions  were  drawn  from  the  analyses  and  evaluation  of  data: 

1. Sighted college students enrolled in a basic course taught by
self-instruction methods earn significantly higher achievement
scores when variable speed compressors are used to listen to the
taped modules than when normal speed tape recorders are used to
listen to the same cognitive information.

2. Sighted college students enrolled in a basic course taught by
self-instruction methods save significant amounts of time when variable
speed compressors are used to listen to the tapes, compared with when
normal speed tape recorders are used.



Post  Hoc  Findings 

During this study, while observations were being noted concerning amounts of
time spent and scores earned by both treatment groups, an attempt was made to
explain some characteristics of the students using compressed speech. Tests
and questionnaires used included a Likert-type questionnaire, the Brown-Carlsen
Listening Comprehension Test, the Nelson-Denny Reading Test, and the handedness
and Torque tests as described by Blau (1974).

From  the  results  of  these  evaluations  the  following  conclusions  were  made: 

1. The majority (70 percent) of the students using variable speech
compressors liked listening to tapes at faster than normal rates,
learned to understand time compressed speech quickly (69 percent),
and liked the idea of speeding up the time needed to complete
modules in the learning laboratory (81 percent).

2. There was no correlation between student rankings on the listening
test, or on the reading comprehension and vocabulary tests, and whether
students used high rates of speed when listening to tapes or
medium to low rates of speed.

3. A higher percentage of students reading at 300 or more words per
minute used high speeds of compression (40-50 percent), as
compared with students who listened at low (20-30 percent) or
medium (30-40 percent) speeds.

4. There was no significant correlation between the rate of compression
students used (high versus lower rates) and their handedness
or laterality.

This study attempted to investigate the efficiency and effectiveness of
variable rate speech compression in a context and manner that would be convincing
to teachers, administrators, and researchers. Variable rate time compressed
speech is a tool to aid students to listen and learn at their own best, or
preferred, rate.


Blau, T.H., The Sinister Child, paper presented at the American Psychological
Association, September 2, 1974.

Boyle,  G.J.,  Compressed  speech  in  medical  education.   Proceedings  of  the  Second 
Louisville  Conference  on  Rate  and/or  Frequency-Controlled  Speech,  1969,  328-330. 

Campbell, D.T. & Stanley, J.C., Experimental and quasi-experimental designs for
research. Chicago: Rand McNally.

Challis,  A.J.   The  effect  of  fixed  and  learner  selected  rates  of  compressed  speech 
in  an  audio-tutorial  learning  environment  on  the  achievement  of  college  level 
students.   Doctoral  dissertation,  University  of  Oklahoma,  1973.   Dissertation 
Abstracts  International,  34,  1973,  3908-A.   University  Microfilm  No.  73-26, 
313. 

Duker,  S.  Time  compressed  speech:   an  anthology  and  bibliography  in  three 
volumes.   Metuchen,  N.J.:   The  Scarecrow  Press,  Inc.,  1974. 

Eckhardt, W.W., Learning in Multi-Medial Programmed Instruction as a Function
of Aptitude and Instruction Rate Controlled by Compressed Speech. D.A., 31:2249A,
1970.

Foulke, E. & Sticht, T.G., Review of research on the intelligibility and
comprehension of accelerated speech. Psychological Bulletin, 72, 1969, 50-62.

Gleason, G., Calloway, R. and Lakota, R., "Effects of Audio Rate Compression on
Student Comprehension and Attitudes" in Duker, S. (ed.) Time Compressed Speech:
An Anthology and Bibliography, 1974.

Hass,  B.M.,  Time-compressed  speech  in  use.   Collegiate  News  and  Views,  17,  Spring, 
1974. 

Libby, J.A., Compressed Speech Used in Auto-tutorial Course Instruction, CRCR
Newsletter, 1971, 5, 3.

Perry, T.K., The effects upon the learner of a compressed slide-audio tape
presentation experienced in a learning carrel as measured by recall and application
tests. CRCR Newsletter, 4, 1970, (7), 1-2.

Rossiter, C.M., Rate-of-presentation effects on recall of facts and ideas and on
generation of inferences. AV Communication Review, 19, 1971 (3), 313-324.

Rome,  S.,  The  Variable  Speed  Compressor,  speech  presented  at  National  Convention  of 
Association  for  Educational  Communication  and  Technology,  1972,  Minneapolis. 

Sarenpa, D.E., A comparative study of two presentations of rate controlled audio
instruction in relation to certain student characteristics. Doctoral dissertation,
University of Minnesota, 1971. Dissertation Abstracts International, 32, 1971, 1199-A.

Short,  S.H.,  The  development  and  comparative  evaluation  of  a  course  in  basic  nutrition 
and  food  science  taught  by  self-instruction  methods.   Doctoral  dissertation, 
Syracuse  University,  1970.   Dissertation  Abstracts  International,  31,  1971. 

Short, S.H., Innovations in nutrition education. Audiovisual Instruction, 16, 1971,
19-21.



Short,  S.H.,  Audio  speedteach.   Media  and  Methods,  9,  1973,  63-68. 

Short,  S.H.,  Media  in  teaching  college  level  nutrition.   Journal  of  the  American 
Dietetic  Association,  66,  1975,  581-587. 

Short, S.H., Hough, O., Dibble, M.V. & Sarenpa, D., Development and utilization
of a self-instruction laboratory. Journal of Home Economics, 61, 1969, 40-44.

Steffan,  R.F.,  Unobtrusive  observation  of  student  non  verbal  behavior  in  audio- 
tutorial  self-instruction.   Doctoral  dissertation,  Syracuse  University. 
Dissertation  Abstracts  International,  32,  1972. 

Sticht,  V.B.,  Studies  on  the  efficiency  of  learning  by  listening  to  time-compressed 
speech:   an  anthology  and  bibliography  in  three  volumes,  II.   Metuchen,  N.J.: 
The  Scarecrow  Press,  1974. 

Tonra, R., Compressed speech used in introductory course. CRCR Newsletter, 6,
1972 (4), 1-2.

Winer,  B.J.,  Statistical  principles  in  experimental  design.   New  York:   McGraw  Hill, 
1971. 




The Effect of Fixed and Learner Selected Rates of Compressed
Speech in an Audio-Tutorial Learning Environment on the Achievement
of College Level Students
by  Challis,    A.  J. 


ABSTRACT 

This  study  investigated  the  use  of  compressed  audio  tapes  in  an  audio- 
tutorial  portion  of  an  on-going,  full  semester  course  in  the  production,  selection, 
and  utilization  of  audio  visual  materials  and  equipment. 

Both cognitive and affective evaluations were conducted 1) to determine
if the application of compressed speech recordings in a learning environment is an
academically sound procedure, 2) to assess student preference in regard to this
medium, and 3) to collect data which could serve as the basis for decisions
regarding the academic practicability of compressed speech as an additional tool
of learning.

Although  no  significant  difference  in  achievement  was  found,  students  did 
save  up  to  31%  of  the  available  time. 

Students  expressed  favorable  attitudes  toward  use  of  compressed  speech 
both  as  a  primary  mode  for  learning  subject  matter  and  as  a  technique  for  review. 
Most  felt  that  learner  control  over  the  rate  of  compression  was  necessary  or 
desirable  for  a  most  satisfactory  learning  experience. 

It  was  concluded  that  compressed  speech  is  an  academically  practical 
and  effective  tool  of  learning. 




THE  EFFECT  OF  FIXED  AND  LEARNER  SELECTED  RATES  OF 
COMPRESSED  SPEECH  IN  AN  AUDIO-TUTORIAL  LEARNING 
ENVIRONMENT  ON  THE  ACHIEVEMENT  OF 
COLLEGE  LEVEL  STUDENTS 


A.  James  Challis 
Miami  University 


INTRODUCTION 


The "Knowledge Explosion," about which we have heard so much in the
past few years, has established the conditions for the educational problem which
this investigation addresses. Is it possible to enlist current technology to
maximize available time in the teaching/learning environment, while at the same time
maintaining or increasing the current academic achievement level? It is
technically possible to incorporate speech compression features in any audio/listening
situation, including dial access information retrieval systems. The addition of
this capability at institutions where students have access to audio libraries
equipped with listening carrels could result in considerable savings of student
learning time, tape usage, and tape storage, while maximizing use of the carrels.

If  it  can  be  established  that  the  use  of  compressed  speech  is  both 
educationally  sound  and  academically  feasible,  a  significant  contribution  in  the 
alleviation  of  the  problem  of  assimilation  of  greater  amounts  of  verbal  information 
within  a  given  period  of  time  will  have  been  made. 

Statement  of  the  Problem 


The problem of this study was to determine 1) whether or not the use of
various compression rates of recorded audio information in a college level course
significantly affected achievement as measured by objective examinations, 2) the
extent to which learner selection of the compression rate influenced the level of
comprehension/achievement as measured by the same tests, and 3) the degree of
student satisfaction with learning via compressed speech, as measured by an
affective questionnaire.

Purpose  of  the  Study 

This  investigation  was  conducted  1)  to  determine  if  the  application  of 
compressed  speech  recordings  in  a  learning  environment  is  a  procedure  which  is 
academically  sound,  2)  to  assess  student  preference  in  regard  to  this  medium, 
and  3)  to  collect  data  which  could  serve  as  the  basis  for  decisions  regarding  the 
academic  practicability  of  compressed  speech  as  an  additional  tool  of  learning. 




Hypotheses 
The  following  five  null  hypotheses  were  tested  in  this  study: 

1. Students who learn from compressed recordings will not
score significantly differently on end of course examinations
than students who learn the same material from non-compressed
recordings.

2. Students who can select their own rate of compressed speech
for each of several audio-tutorial units will not score significantly
differently on end of course examinations from students who
are assigned one rate of compressed recordings for all units.

3.  There  will  be  no  significant  difference  in  attitude  toward 
compressed  speech  between  students  who  select  their  own  rate 
of  compression  and  students  who  are  assigned  a  fixed  rate  of 
compressed  recordings. 

4.  There  will  be  no  significant  interaction  between  entering 
grade  point  averages  and  achievement  scores  with  different 
rates  of  compressed  recordings. 

5.  There  will  be  no  significant  interaction  between  the  amount 
of  time  spent  in  audio  learning  activities  in  which  different 
rates  of  compressed  recordings  are  used,  and  scores  achieved 
on  end  of  course  examinations. 

Additional  Research  Questions 

The  following  eleven  additional  questions  are  secondary  to  those  cited 
earlier.    They  deal  not  with  achievement,  and  the  cognitive  domain,  but  rather 
with  student  satisfaction,  and  consequently,  the  affective  domain.    They  are  listed 
separately,  and  in  this  order,  to  facilitate  comparison  with  the  results  of  the 
affective  questionnaire: 

1. Will students express a desire to use compressed speech
in other college level courses?

2. Will the students who listen to compressed speech perceive
it as impeding learning by raising their level of anxiety?

3. Will students feel that they must control the compression
rate in order to maximize learning with compressed speech?

4. Will listening to compressed speech be perceived as more
fatiguing than listening to normal rate speech?

5.  Will  the  use  of  filmstrips  in  conjunction  with  compressed  speech 
be  perceived  as  detracting  from  the  effectiveness  of  compressed 
speech? 

6.  Will  students  express  a  preference  for  a  particular  rate  of 
compression? 

7.  Will  students  who  exercise  control  over  the  rate  of  delivery  of 
information,  on  the  average,  tend  to  select  a  common  rate  of 
compression? 

8. Does the use of compressed speech involve serious administrative
or equipment problems which would make its use academically
impractical?

9.  Will  students  who  have  completed  a  course  which  uses  compressed 
speech  feel  that,  with  practice,  a  person  can  advance  to  increasingly 
higher  rates  of  compressed  speech? 

10.  Will  students  who  have  completed  a  course  which  utilizes  compressed 
speech  feel  that  the  state  of  the  art  is  sufficiently  developed  to  be 
adopted  as  a  common  educational  practice? 

11. Will students who have completed a course which utilizes compressed
speech indicate that excessive replaying of compressed speech recordings
is necessary for satisfactory understanding?


Population  and  Sample 

The  population  from  which  the  sample  was  drawn  consisted  of  college 
juniors  and  seniors  enrolled  in  a  required  basic  audiovisual  course.  A  table  of 
random  numbers  was  used  to  assign  the  96  students  to  one  of  the  following  four 
groups: 

Group I   -- Control Group, normal rate speech; WPM 120.

Group II  -- Exp. Gp. (E1), 30% compression; WPM 174.

Group III -- Exp. Gp. (E2), 40% compression; WPM 200.

Group IV  -- Exp. Gp. (E3), compression factor at students' discretion.
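
The relation between compression percentage and word rate in the group list can be sketched as follows. This is only an illustration, assuming that "x% compression" means x percent of the playing time is removed, so that the word rate is the base rate divided by (1 - x). The function name `compressed_wpm` is hypothetical and does not come from the study. Under that assumption the 40% figure of 200 WPM is reproduced exactly, while the nominal value for 30% is about 171 WPM, slightly below the 174 WPM listed; the paper does not say how the 174 figure was obtained.

```python
def compressed_wpm(base_wpm: float, compression: float) -> float:
    """Word rate after removing `compression` (a fraction of 0 to <1) of playing time.

    Assumption: x% compression shortens the recording to (1 - x) of its
    original duration, so the same words are heard in less time.
    """
    if not 0 <= compression < 1:
        raise ValueError("compression must be in the range [0, 1)")
    return base_wpm / (1.0 - compression)


BASE_WPM = 120  # instructor's mean reading rate reported in the study

for percent in (0, 30, 40):
    rate = compressed_wpm(BASE_WPM, percent / 100)
    print(f"{percent:>2}% compression: about {rate:.0f} WPM")
```
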




After the students were assigned to the experimental groups, the mean
GPA (Grade Point Average) for each group was computed to obtain a measure of
homogeneity between groups. The results of that computation were:

            Group I     Group II     Group III     Group IV     Grand Mean
Mean GPA    2.74        2.88         2.70          2.83         2.79


METHODS  AND  PROCEDURES 

The  Environment  of  the  Experiment 

The recordings used constituted an automated, self-instructional
portion of the course "Media and Technology in Teaching," and covered historical
development, materials production techniques, and current utilization procedures
for audio-visual materials and equipment. There were 16 recorded lectures (units)
with accompanying 35mm filmstrips. All recordings were made by the same
instructor, whose mean reading rate was 120 WPM. That this is below the mean
found by Foulke is accounted for by two factors: 1) the material of this course is
consistently more technical in nature than the material used by Foulke's readers,
and 2) the present reading is interspersed with accompanying filmstrips. It
should be noted that Short's study used recordings which averaged 110 WPM before
compression. Short accounted for this rate for much the same reasons as given
in 2) above, plus the attempt to keep a more conversational, one-to-one relationship
built into the audio-tutorial learning environment.

Individual carrels and individual headsets were not used in the Watts
experiment. The implied suggestion contained in the following statement by Watts
about his study was heeded in the present design:

It is suspected, though, that student attentiveness may
have been greater if the experience had been more
"Privatized" using earphones and study carrels to
reduce distraction.


1 U.S. Department of Health, Education and Welfare, The Comprehension
of Rapid Speech by the Blind, by Emerson Foulke, Project No. 2430 Part III
(Washington, D.C.: Office of Education, 1967).

2 Sarah H. Short, "The Use of Compressed Speech Tapes in a Multi-
Mediated Learning Laboratory." (Paper presented at the annual convention of the
Association for Educational Communications & Technology, Minneapolis, 1972).

3 Meridith W. Watts Jr., Using Compressed Speech to Teach Instructional
Techniques to Air Force Officers, Department of Instructional Technology Report
(Alabama: Maxwell Air Force Base, 1969).



All subjects listened to the recordings while viewing the filmstrips in individual
listening/viewing carrels. The subjects in the present experiment wore headsets
and had individual control over the volume of the recording to which they were
listening; however, they could not adjust the tone.

To  control  the  variable  of  audio  equipment  performance,  all  tape 
decks  in  the  carrels  were  serviced  prior  to  the  start  of  the  experiment.    This 
included  the  standardization  of  each  position's  playback  heads  with  a  Nortronics 
AT-100  azimuth  and  amplifier  alignment  tape. 

Of the total number of subjects participating in the experiment, 74%,
equally and randomly distributed among the four groups, were given a hearing
test on a calibrated Beltone model 10C. Specific frequencies checked were
125, 250, 500, 1,000, 2,000, 4,000, and 8,000 Hz, over a dB range from -10
to +50. No significant hearing losses were detected. The hearing test was
administered to establish the subjects' ability to hear the audio tone which prompted
them to advance the filmstrip viewer to the next frame. This procedure was
incorporated because it was felt that the student who did not keep the audio and
visual information in synchronization would quickly become frustrated and thereby
develop a negative attitude toward the entire audio-tutorial procedure.

The  listening/viewing  carrels  were  arranged  in  two  rows  of  six  and 
one  row  of  nine  in  a  standard  language  lab  configuration.    Subjects  in  the  carrels 
all  faced  forward  toward  the  control  console.    The  carrels  had  opaque  acoustical 
separating  panels  on  both  sides  and  a  clear  plexiglass  panel  at  the  front  which 
allowed  observation  of  the  subjects  from  the  master  console.    The  carrels  were 
located  in  an  enclosed,  quiet  corner  in  the  rear  of  an  instructional  media  center. 
Entry  to  the  area  was  from  a  single  doorway  in  the  rear,  so  that  there  was  little 
distraction  from  people  passing  the  area.    Since  the  area  was  located  on  the  second 
floor,  the  windows  along  one  wall  offered  no  distraction  from  passers-by. 

The audio tape recordings which the subjects were directed to use were
located on separate book shelves in the rear of the carrel area, and labeled with
the group numbers: I, II, III, or IV. Subjects in group IV were allowed to select
tapes from the group IV shelves which had compression rates of 20%, 25%, 50%
and 55%, or from the group II (30%) or group III (40%) shelves. They were directed
not to use group I (no compression) tapes. 4

All subjects received an orientation on the operation of carrel equipment,
the time sheet, and the selection and reshelving of both recordings and filmstrips.
An eight step procedures sheet and a sample completed time sheet were posted in
each carrel. The subjects recorded the beginning and ending time of each listening
activity on prepared time sheets, which were turned in upon completion of each unit.


4  All  tapes  were  compressed  using  PKM  Corporation's  VOCOM  I  Speech 
Compressor/Expander. 




Selections of rates of compression by subjects in group IV were made a
matter of record by the student, who entered this information on the time sheet.

The subjects were periodically spot checked while they were in the
carrel area to ensure that they selected tapes from and returned tapes to the
proper shelf, that the selected unit (and for Group IV subjects the compression
rate) was being recorded correctly, and that the time entries were completed in
the manner directed. All time sheets were examined at the end of each class
day to ensure accuracy and completeness. Whenever errors were found, the
student was asked to make the necessary corrections.


Choice  of  Experimental  Design 


Variables 


The  variables  involved  in  this  study  were: 


Dependent Variable    -- Achievement. Determined by scores
                         obtained on posttests.

Dependent Variable    -- Time spent in learning. Determined by
                         subtracting the running time of the
                         recorded units from the time consumed
                         by subjects, and comparing the resultant
                         between groups.

Independent Variable  -- Rate of delivery of information. Controlled
                         by the assignment of given rates of compressed
                         speech to selected groupings of subjects.


Assignment  of  subjects  to  groups 

Working  with  the  class  roster  and  a  random  number  table,  the  students 
were  assigned  to  the  following  groups,  24  subjects  per  group:  (N=96) 


Group I   -- Control group. Normal rate speech.

Group II  -- E1. Compression 30%.

Group III -- E2. Compression 40%.

Group IV  -- E3. Compression rate selected by subjects.
             Choice of 20%, 25%, 30%, 40%, 50% or 55%.
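
The assignment step can be illustrated with a short sketch. Note the substitution: the study drew on a printed random number table, whereas this sketch uses a seeded pseudo-random shuffle to get the same outcome of four equal groups of 24. The roster entries, the seed, and the function name `assign_groups` are all hypothetical.

```python
import random


def assign_groups(roster, group_names=("I", "II", "III", "IV"), seed=None):
    """Randomly split a roster into equal-sized groups.

    A pseudo-random shuffle stands in for the random number table
    used in the study; the seed only makes the sketch repeatable.
    """
    rng = random.Random(seed)
    shuffled = list(roster)
    rng.shuffle(shuffled)
    size, remainder = divmod(len(shuffled), len(group_names))
    if remainder:
        raise ValueError("roster does not divide evenly among the groups")
    return {
        name: shuffled[i * size:(i + 1) * size]
        for i, name in enumerate(group_names)
    }


# Placeholder roster of 96 students (N = 96 in the study).
roster = [f"student_{n:02d}" for n in range(1, 97)]
groups = assign_groups(roster, seed=1975)

for name, members in groups.items():
    print(f"Group {name}: {len(members)} subjects")
```
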




Experimental  design  paradigm 

The  following  experimental  design  paradigm  was  selected: 

R   X1   O1   -- X1 = Normal rate speech.

R   X2   O1   -- X2 = 30% Compression.

R   X3   O1   -- X3 = 40% Compression.

R   X4   O1   -- X4 = Compression rate selected by subjects.

Statistical  Design 

A  simple  one-way  analysis  of  variance  (ANOVA)  was  first  conducted 
to  determine  the  effects  of  the  independent  variable  of  compression  rate  on 
achievement,  using  the  following  paradigm: 

                         COMPRESSION RATE

                   I       II       III       IV
ACHIEVEMENT

The GPA (grade point average) for each student was obtained, and a
two-way ANOVA was used to determine if there were significant differences in
achievement over various compression rates, and if interaction occurred between
GPA and the independent variable of compression rate. The following paradigm
was used:

                         COMPRESSION RATE

                   I       II       III       IV
GPA         HI
            LO

Another two-way ANOVA was conducted to determine if there were
significant differences in achievement, over various compression rates, between
high achievers and low achievers, and if there would be interaction. The following
paradigm was used:


                         COMPRESSION RATE

                   I       II       III       IV
ACHIEVEMENT HI
            LO



A third two-way ANOVA was used to determine if there was significant
interaction between time spent in learning, over various compression rates, and
the dependent variable of achievement. This ANOVA used the following paradigm:

                         COMPRESSION RATE

                   I       II       III       IV
TIME SPENT  MOST
IN LEARNING LEAST


Instruments  Used 


The  instruments  used  in  this  study  were:  1)  a  78  item  multiple  choice 
midterm  examination,  2)  a  54  item  multiple  choice  final  examination,  and  3)  a  29 
item  attitude  questionnaire. 

RESULTS 

Results of testing hypotheses one (1) and two (2)

The  maximum  average  score  attainable  for  the  combined  midterm  and 
final  exam  was  66.    For  the  purpose  of  this  experiment  a  raw  score  mean  difference 
of  five  points  was  chosen  by  the  investigator  as  the  threshold  indication  of  a  practical 
difference. 

TABLE 1. -- Group means and standard deviation

          Compression                  Standard       Grand
Group     Rate            Mean         Deviation      Mean
I         0%              48.15        6.70           47.02
II        30%             47.52        6.55
III       40%             46.06        5.27
IV        20%-55%         46.33        5.51

The mean for subjects using normal rate speech (Group I) was 48.15
while subjects who listened to compressed speech achieved a mean score of 46.70,
a difference of 1.45 raw grade points. Those subjects assigned to a fixed rate of
compressed speech (Group II plus Group III) achieved a mean score of 46.79 while
subjects who selected their own rate of compression achieved a mean score of 46.33.
This latter figure represents a raw grade point difference of .46. Such small
practical differences lent support to both hypotheses one and two.

To provide a statistical basis for conclusions, a simple one-way analysis
of variance was computed between the means of all groups. Alpha was set at .25
to minimize the risk of a type II error. No significant differences were found.
Table 2 shows the result of that ANOVA.

TABLE 2. -- One-way ANOVA between all groups

                                                    Level of
Variance            df       MS         F           Significance
Between Groups       3       46.59      1.30        NSD
Within Groups       92       35.72
Total               95

α = .25     df 3,92 = 1.40

Null  hypotheses  one  and  two  were  retained. 
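
The arithmetic behind the comparisons for hypotheses one and two can be retraced in a few lines. This is a minimal sketch that only re-derives figures from the reported group means and mean squares; the variable names are illustrative, and the critical value of 1.40 is taken from the paper rather than computed.

```python
# Maximum average score: mean of a 78-item midterm and a 54-item final.
midterm_items, final_items = 78, 54
max_average = (midterm_items + final_items) / 2

# Group achievement means as reported for Groups I to IV.
means = {"I": 48.15, "II": 47.52, "III": 46.06, "IV": 46.33}
grand_mean = sum(means.values()) / len(means)

# Fixed-rate groups (II and III) versus the learner-selected group (IV).
fixed_rate_mean = (means["II"] + means["III"]) / 2
fixed_vs_selected = fixed_rate_mean - means["IV"]

# One-way ANOVA: F is the ratio of the between- and within-group mean squares.
ms_between, ms_within = 46.59, 35.72
f_ratio = ms_between / ms_within
critical_f = 1.40  # reported critical value for alpha = .25, df = 3, 92
significant = f_ratio > critical_f

print(f"maximum average score: {max_average:.0f}")
print(f"fixed-rate mean: {fixed_rate_mean:.2f}")
print(f"fixed minus selected: {fixed_vs_selected:.2f}")
print(f"F = {f_ratio:.2f}, exceeds critical value: {significant}")
```

Because the computed F falls below the reported critical value, the sketch agrees with the decision to retain both null hypotheses.
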
Results  of  testing  hypothesis  three  (3) 

Attitude toward compressed speech as a tool of learning was assessed
through tabulation of subjects' responses to those items of the questionnaire which
addressed hypothesis three. Since there was a choice of negative or positive
responses to each item, the total value for an individual subject could range from
-46 to +46. Table 3 shows the resulting measure of attitude obtained from the
questionnaire. The numerical value was derived by totalling the responses of all
subjects within a group to the pertinent questionnaire items. SD (strongly disagree),
D (disagree), U (undecided), A (agree), and SA (strongly agree) were scored
-2, -1, 0, +1, and +2 respectively.

TABLE 3. -- Attitude toward compressed speech

         Compression               No. of          Mean        Nr. of Neg.
Group    Rate          Value       Respondents     Value       Responses
I        0%            +202        16              +12.62      2
II       30%           +289        14              +20.64      0
III      40%           +217        12              +18.08      1
IV       20-55%        +263        13              +20.23      2
Mean                   +242.75     13.75           +17.89      2.5

Individual range: -46 to +46

Of the subjects responding to these specific items on the questionnaire,
91% expressed a favorable attitude toward compressed speech as a primary mode
for learning subject matter. Those subjects who selected their own rates of
compressed speech (Group IV) averaged .87 of a point higher on the attitude
questionnaire than subjects who were assigned to fixed rates of compressed speech
(Groups II and III). Because of this low obtained value, and because the actual
difference in mean value between subjects assigned to 30% and subjects assigned to
40% compression rates was itself 2.56, no statistical test was employed to test
hypothesis three. Null hypothesis three was retained.
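
The scoring rule for the attitude questionnaire, and the two differences cited for hypothesis three, can be sketched as follows. The count of 23 scored items is an inference from the stated individual range of -46 to +46 at 2 points per item; it is not given directly in the paper. The function name `attitude_score` is hypothetical. Group means are recomputed here from each group's reported total value and number of respondents.

```python
# Likert weights as described in the text.
SCALE = {"SD": -2, "D": -1, "U": 0, "A": 1, "SA": 2}

# Assumption: 23 items, inferred from the -46 to +46 individual range.
N_ITEMS = 23


def attitude_score(responses):
    """Sum the weighted responses of one subject."""
    return sum(SCALE[r] for r in responses)


# Extremes of the scale reproduce the stated individual range.
lowest = attitude_score(["SD"] * N_ITEMS)
highest = attitude_score(["SA"] * N_ITEMS)

# (total value, number of respondents) reported for each group.
group_totals = {"I": (202, 16), "II": (289, 14), "III": (217, 12), "IV": (263, 13)}
mean_values = {g: total / n for g, (total, n) in group_totals.items()}

# Learner-selected rate (IV) versus the average of the fixed-rate groups.
fixed_mean = (mean_values["II"] + mean_values["III"]) / 2
selected_minus_fixed = mean_values["IV"] - fixed_mean

# Difference between the two fixed-rate groups themselves.
between_fixed = mean_values["II"] - mean_values["III"]

print(f"individual range: {lowest} to +{highest}")
print(f"selected minus fixed: {selected_minus_fixed:.2f}")
print(f"30% minus 40%: {between_fixed:.2f}")
```
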

Results of testing hypothesis four (4)

Testing of the fourth hypothesis was accomplished by a two-way analysis
of variance using rate of compression and GPA (grade point average) as main
effects. For this analysis, normal rate speech was considered as having a
compression rate of zero. Subjects in each of the four groups were ranked
according to their GPA and the top one third was designated HI (GPA) and the
bottom one third was designated LO (GPA). The achievement means of these
divisions, for the control group and each of the experimental groups, are displayed
in Table 4. Figure 1 shows an interaction plot of the mean scores of the LO and
HI GPA students over the various compression rates.

TABLE 4. -- Means of HI GPA and LO GPA over various rates of compressed speech

               I          II         III        IV         Mean
GPA    HI      51.00      50.81      50.69      50.56      50.77
       LO      42.00      43.69      42.31      44.31      43.08
Mean           46.50      47.25      46.50      47.44      46.93

[Figure 1: achievement means (vertical axis, 42.00 to 51.00) plotted for LO GPA
and HI GPA students (horizontal axis), with one line for each of Groups I
through IV.]

Fig. 1. -- Interaction plot of HI GPA and LO GPA achievement means.

Although Figure 1 does show interaction to be present, the results of a
two-way ANOVA, with compression and GPA as main effects and achievement as
the dependent variable, indicate that such interaction is not significant. (See Table 5.)

TABLE 5. -- Two-way ANOVA comparing achievement over various compression
rates with GPA

                                                                   Level of
Source of Variance         SS           df     MS         F        Significance
Between Columns (Rate)     11.67         3     3.89       .13      NSD
Between Rows (GPA)         954.56        1     954.56     32.36    .05
Columns by Rows            9.29          3     3.10       .11      NSD
  (interaction)
Between Groups             975.54        7     139.36
Within Groups              1,652.07     56     29.50
Total                      2,627.61     63

α = .05     df 3,56 = 2.78
Null  hypothesis  four  was  retained. 
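
The F ratios in the two-way analysis follow from dividing each effect's mean square by the within-groups mean square. The sketch below repeats that division using the reported mean squares; the dictionary layout and names are illustrative only, and the critical value of 2.78 (df 3, 56) is the one stated in the paper. The GPA effect has 1 and 56 degrees of freedom, so it is shown without a comparison against that critical value.

```python
MS_WITHIN = 29.50          # within-groups mean square (df = 56)
CRITICAL_F_3_56 = 2.78     # reported critical value, alpha = .05

# effect name -> (mean square, degrees of freedom)
effects = {
    "compression rate": (3.89, 3),
    "GPA": (954.56, 1),
    "rate by GPA interaction": (3.10, 3),
}

f_ratios = {name: ms / MS_WITHIN for name, (ms, _df) in effects.items()}

for name, (_ms, df) in effects.items():
    line = f"{name}: F({df}, 56) = {f_ratios[name]:.2f}"
    if df == 3:
        exceeds = f_ratios[name] > CRITICAL_F_3_56
        line += f", exceeds {CRITICAL_F_3_56}: {exceeds}"
    print(line)
```

Only the GPA main effect yields a large F; the rate effect and the interaction both fall far below the reported critical value, which is consistent with retaining null hypothesis four.
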

Results  of  testing  hypothesis  five  (5) 

In addressing the fifth hypothesis, three questions were asked: 1) Did
those subjects who achieved the highest scores for a given rate of compression
spend consistently more time in learning activities than subjects who achieved the
lowest scores with that same rate? 2) Did subjects who spent more time with a
given compression rate achieve significantly higher scores than subjects who spent
less time with that same rate? and 3) Was there interaction between achievement
and time spent, with different compression rates?

To answer the first question, subjects in each group were ranked
according to the average of their combined scores on the midterm and final
examination. The top one third in each group was designated as HI achievers and the
bottom one third was designated as LO achievers. Table 6 shows a comparison of the
time spent by each group in listening, with the corresponding scores obtained.

It  is  apparent  from  inspection  of  Table  6  that  the  HI  achievers  in  the 
control  group  (I)  and  in  each  of  the  experimental  groups  did  spend  consistently 
more  time  with  the  audio-tutorial  units  than  did  the  LO  achievers.    Accordingly, 
the  answer  to  the  first  question  was  yes. 

TABLE  6.  — Time  versus  achievement  of  HI  and  LO  achievers 

               HI ACH                  LO ACH
Group    Mean Time    Score      Mean Time    Score      d (hr)

I          8:37       54.94        7:43       40.06       :54
II         8:18       54.88        6:35       40.31      1:43
III        6:26       51.93        5:14       40.69      1:12
IV         8:22       52.38        6:22       40.19      2:00

X̄          7:56       53.53        6:29       40.31      1:27

To  answer  the  second  question,  subjects  in  the  control  group  and  in  the 
experimental  groups  were  ranked  from  the  most  time  spent  to  the  least  time  spent. 
The  top  one  third  was  designated  as  MOST  time  spenders,  and  the  subjects  in  the 
bottom one third were designated as LEAST time spenders. The achievement means
of these divisions, for the control group and the experimental groups, are displayed
in Table 7.

TABLE  7.  — Means  of  MOST  and  LEAST  time  spenders  over  various  rates  of 
compressed  speech 


            I        II       III       IV       Mean

MOST      48.69    48.31    46.81    47.81     47.91
LEAST     47.25    46.69    44.81    43.31     45.52

X̄         47.97    47.50    45.81    45.56     46.71

Although Table 7, like Table 6, shows that subjects who spent more time
did obtain higher scores, the results of a two-way ANOVA using compression
rate and time spent as main effects, and achievement as the dependent variable,
revealed that the differences in the scores were not significant. (See Table 8.)
Thus, the answer to the second question was negative.

TABLE  8.  — Two-way  ANOVA  comparing  achievement  over  various  compression 
rates  between  MOST  and  LEAST  time  spenders 


Source of Variance            SS        df      MS         F     Level of Significance

Between Columns (rate)        36.12      3      12.04      .30   NSD
Between Rows (Time)           53.07      1      53.07     1.31   NSD
Columns by Rows
  (interaction)               93.83      3      31.28      .77   NSD

Between Groups               183.20      7      26.17
Within Groups              2,274.29     56      40.61
Total                      2,457.31     63

α = .05     df 3,56 = 2.78

To answer the third question, a two-way ANOVA was computed with the
achievement means of the HI achievers and LO achievers listed in Table 6. The
results of this ANOVA, using Rate, and HI and LO achievers as main effects, are
displayed in Table 9. No significant difference was apparent between the various
compression rates, and no interaction effect was detected. Accordingly, the
answer to the third question was negative.




TABLE  9  — Two-way  ANOVA  comparing  achievement  over  various  compression 
rates  between  HI  and  LO  achievers 


Source of Variance            SS        df      MS         F     Level of Significance

Between Columns (rate)        25.73      3       8.58      .35   NSD
Between Rows (HI
  vs LO ACH)               2,802.38      1   2,802.38   114.57   .01
Columns by Rows
  (interaction)               37.58      3      12.53      .51   NSD

Between Groups             2,865.69      7     409.38
Within Groups              1,369.59     56      24.46
Total                      4,235.28     63

α = .05     df 3,56 = 2.78

Null  hypothesis  five  was  retained. 
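
The arithmetic behind an ANOVA summary table of this kind can be checked directly: each mean square is the sum of squares divided by its degrees of freedom, and each F ratio is that mean square divided by the within-groups mean square. The following is a minimal sketch using the Table 9 values; the variable and function names are illustrative assumptions, not part of the original analysis, and small differences in the last decimal place can arise because the published table works from rounded mean squares.

```python
# Sums of squares and degrees of freedom copied from Table 9.
# The dictionary layout and names are illustrative only.
effects = {
    "rate": (25.73, 3),
    "hi_vs_lo": (2802.38, 1),
    "interaction": (37.58, 3),
}
ss_within, df_within = 1369.59, 56

# Error term: within-groups mean square.
ms_within = ss_within / df_within


def f_ratios(effect_table, ms_error):
    """Return {effect: (MS, F)} with MS = SS / df and F = MS / MS_error."""
    out = {}
    for name, (ss, df) in effect_table.items():
        ms = ss / df
        out[name] = (ms, ms / ms_error)
    return out


results = f_ratios(effects, ms_within)
for name, (ms, f) in results.items():
    print(f"{name:12s} MS = {ms:9.2f}   F = {f:7.2f}")
```

Under these assumptions the recomputed F ratios for rate and for interaction fall well below the tabled value of 2.78 for 3 and 56 degrees of freedom, in agreement with the NSD entries.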

Additional  results  of  analysis 

A comparison of the subjects' listening time with the actual running time of
the normal rate and compressed audio-tutorial units was made to answer the question,
"Would subjects who listened to compressed speech negate the time saving potential
by 'excessive' replaying?"

The total uninterrupted running time of the units for each rate, as shown in
column four, was subtracted from the mean total time recorded by subjects in the
control group and in the first two experimental groups. The results of that
comparison are shown in Table 10.

TABLE  10.  — Time  comparisons  of  normal  rate  and  compressed  speech 


         Compression   Listening   Running   Replay/Note-    Listening     X̄
Group    Rate          Time        Time      Taking Time     Time Saved    ACH

I         0%           8:32        5:42      2:50            0             48.15
II        30%          7:06        3:55      3:11            1:26          47.52
III       40%          5:51        3:21      2:30            2:41          46.06

Subjects listening to 30% compression spent an average of 21 minutes more in
replaying/notetaking activities than subjects listening to normal rate speech, while
subjects listening to 40% compression spent an average of 20 minutes less in
replaying/notetaking activities than subjects listening to normal rate speech. It is apparent
that replaying time does not negate the time saving potential of compressed
speech, since subjects in the 30% group spent one hour and 26 minutes less time
in completing all units than the normal rate group, while subjects in the 40% group
spent two hours and 41 minutes less time in completing all units than the normal
rate group. Accordingly, the subjects who listened to units compressed by 30%
realized an average time saving of 17%, while subjects listening to units compressed
by 40% realized an average time saving of 31%.
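
The figures quoted above follow from simple clock arithmetic on the Table 10 entries. The sketch below reproduces them, assuming the times are hours and minutes; the helper names (to_minutes, to_hmm) and the dictionary layout are illustrative and do not come from the original study.

```python
def to_minutes(hmm):
    """Convert an 'h:mm' string to a whole number of minutes."""
    hours, minutes = hmm.split(":")
    return int(hours) * 60 + int(minutes)


def to_hmm(total_minutes):
    """Convert a whole number of minutes back to an 'h:mm' string."""
    return f"{total_minutes // 60}:{total_minutes % 60:02d}"


# (compression rate, mean listening time, running time) from Table 10.
groups = {
    "I": ("0%", "8:32", "5:42"),
    "II": ("30%", "7:06", "3:55"),
    "III": ("40%", "5:51", "3:21"),
}

baseline = to_minutes(groups["I"][1])  # normal rate listening time
summary = {}
for name, (rate, listening, running) in groups.items():
    listen = to_minutes(listening)
    replay = listen - to_minutes(running)      # replay/note-taking time
    saved = baseline - listen                  # time saved versus normal rate
    percent_saved = round(100 * saved / baseline)
    summary[name] = (to_hmm(replay), to_hmm(saved), percent_saved)
    print(name, rate, *summary[name])
```

The percentage saved is taken relative to the normal rate group's total listening time, which is the reading that reproduces the 17% and 31% reported in the text.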

Results  of  the  Questionnaire 
Tabulation  of  student  responses  resulted  in  the  following  findings: 

1.  Ninety-one  per  cent  expressed  a  favorable  attitude  toward  compressed 
speech  as  a  primary  mode  for  learning  subject  matter. 

2.  Ninety-four  per  cent  expressed  a  desire  to  take  other  college  courses 
utilizing  compressed  speech. 

3.  One  hundred  per  cent  indicated  they  would  use  compressed  speech  for 
reviewing  subject  matter. 

4.  In response to what rate they felt was most comfortable, 9% chose normal
rate speech; 20% chose 20% compression; 34% chose 30% compression;
and 37% indicated they would choose 40% or higher rates of compression.

5.  Ninety per cent felt that with practice, increasingly higher rates of
compression could be used.

6.  Ninety-seven  per  cent  felt  that  learner  control  over  the  rate  of  compression 
was  necessary  or  desirable  for  a  most  satisfactory  learning  experience. 

7.  Twenty-five  per  cent  felt  that  learning  from  compressed  speech  recordings 
was  more  tiring  than  normal  rate  speech. 

8.  Sixteen  per  cent  felt  that  learning  from  compressed  speech  was  more  anxiety 
producing  than  normal  rate  speech. 

9.  Thirteen  per  cent  felt  that  combining  compressed  speech  and  filmstrips  made 
the  learning  experience  more  difficult. 

10.  Nineteen per cent felt that technical improvement in compressed speech was
needed before it could be implemented as a common means of learning
subject matter.

11.  Forty-six  per  cent  felt  that  continual  replaying  of  parts  of  the  compressed  speech 
recording  was  necessary  for  satisfactory  learning. 

Conclusions 

The  following  conclusions  are  drawn  from  the  analysis  and  evaluation  of  those 
data  resulting  exclusively  from  the  present  investigation: 

1.  College  Junior  and  Senior  students  can  learn  the  type  of  cognitive  matter 
presented  in  this  course  at  least  as  well  via  compressed  speech  as  by  normal 
rate  recordings. 

2.  Less  total  time  is  spent  in  audio-tutorial  learning  of  the  nature  provided  in 
this  investigation  when  compressed  speech  is  used  in  lieu  of  normal  rate 
recordings. 

3.  Audio-tutorial units of the nature used in this investigation, presented via
compressed speech, allow for more engagement with content, i.e., replay
time, than non-compressed recordings during a given period of time.


4.  The  nature  of  compressed  speech  per  se,  does  not  engender  excessive 
replaying. 

5.  The great majority of students express a positive attitude toward learning
cognitive material of the nature used in this investigation via compressed
speech.

6.  Grade  point  averages  are  a  valid  predictor  for  success  in  audio-tutorial 
learning  of  the  nature  and  level  provided  in  this  investigation. 

7.  Compressed  speech  can  be  considered  as  a  satisfactory  alternate  mode 
of  learning. 

8.  Compressed  speech,  used  in  an  independent  study  environment,  provides 
an  additional  measure  of  individualization  regarding  time. 

9.  No  administrative  or  equipment  problems  arise  solely  due  to  the  use  of 
compressed  speech  in  an  audio-tutorial  learning  environment  structured 
to  provide  cognitive  information  of  the  nature  used  in  this  investigation. 

10.      Compressed  speech  is  an  academically  practical  tool  of  learning  in  an  audio- 
tutorial  learning  environment  structured  to  provide  cognitive  information 
of  the  nature  used  in  this  investigation. 

Recommendations

On  the  basis  of  those  data  obtained  from  this  investigation,  the  following 
recommendations  are  offered: 

1.  That  institutions  involved  in  educational  and  training  activities  give  serious 
consideration  to  the  application  of  compressed  speech  in  their  programs. 

2.  That organizations possessing audio tape libraries consider processing
holdings at one or more compression rates and/or make available speech
compression/expansion equipment which allows for learner control over the
rate of compression while listening.


NOTE: 

For  those  wishing  greater  detail  on  this  research,  for  purposes  of  replication, 
see  ERIC    No.   ED  075  995. 


Dr.  Challis  is  a  Media  Specialist  (Instructional  Development)  Audio  Visual  Services, 
and  Assistant  Professor  of  Educational  Media,  Miami  University,  Oxford,  Ohio  45056. 




A  Comparative  Investigation  of  Listening  Rate  Preference 
Employing  Two  Methods  of  Temporal  Alteration 
by Leeper, H. A., & Lass, N. J.


ABSTRACT 

A  COMPARATIVE  INVESTIGATION  OF  LISTENING  RATE  PREFERENCE 
EMPLOYING  TWO  METHODS  OF  TEMPORAL  ALTERATION 

A  paired  comparison  procedure  was  employed  to  compare 
listening  rate  preferences  of  20  adult  female  subjects. 
Recordings of "The Rainbow Passage" (Fairbanks, 1960) were
time-altered to yield seven rates: 150, 175, 200, 225,
250, 275, and 300 words per minute. Two separate time alterations
were completed on the unaltered prose passage. One
procedure (VOCOM-I) altered the passage by a selective vowel
compression  and  pause  deletion  technique,  while  the  other 
method  (Varispeech  I)  altered  the  passage  by  a  random 
expansion/deletion  process.   Two  master  tapes  were  constructed, 
one  for  each  of  the  two  time  alteration  techniques.   The  tapes 
were  presented  to  the  subjects  1  1/2  months  apart.   The 
results  of  the  subjects'  evaluations  indicated  different 
listening  rate  preferences  for  the  two  methods  of  time 
alteration.   The  most  preferred  rate  for  the  vowel  compression 
and  pause  deletion  technique  was  200  wpm,  and  the  least 
preferred  was  150  wpm.   The  most  preferred  rate  for  the 
systematic  discard/expansion  procedure  was  225  wpm,  and  the 
least  preferred  was  300  wpm.   Comparison  of  the  present 
results  with  past  research  suggests  that  the  original 
unaltered  speaking  rates,  as  well  as  the  alteration  of  specific 
portions  of  the  prose  passage,  may  explain  differences  in 
listening  rate  preference  for  normal  adults. 




The development of specialized electronic equipment
which allows for the control of the rate of presentation of
recorded speech without serious frequency alteration has
fostered numerous studies of listener rate preference
(Hutton, 1954; Foulke and Sticht, 1966; Lass and Prater,
1973; Cain and Lass, 1974). Hutton (1954), in a study of
listening rate preference, had 50 subjects listen to 40
versions of a standard reading passage which ranged in
rate from 77.5 to 412.5 words per minute. The samples
were rated on a nine-point scale with upper and lower
limits called "superior" and "inferior," respectively.
Hutton reported that his subjects preferred a speaking
rate equal to 163 words per minute.

In another study of listening rate preference, Foulke
and Sticht (1966) had 100 college-age students judge
listening material of "moderate" difficulty. Using a
method of limits paradigm in which the students
manipulated the rate themselves, the experimenters found a
preferred mean listening rate for the group of 207
words per minute.

Cain and Lass (1974), who employed a slightly different
psychophysical method of rate preference (paired comparison),
indicated that when listening to a standard prose passage
(The Rainbow Passage), college students most preferred rates
of presentation of 175 words per minute, and they least
preferred rates of 100 and 300 words per minute. Furthermore,
Lass and Prater (1973), using a similar paired comparison
method for determining listener rate preferences for oral
reading and impromptu speaking, found 175 words per minute
as the most preferred rate for both tasks. The least
preferred rate for each task was 100 words per minute.

To  date,  several  of  the  studies  of  rate  preference 
(Lass  and  Prater,  1973;  Lass  and  Fultz,  1976)  have  employed 
compressors  that  randomly  discard  and  add  time  features  to 
the  incoming  signal  (Whirling  Dervish;  Varispeech  I). 
These  techniques  employ  a  random  discard  sampling  time  of  60 
milliseconds  or  less  for  satisfactory  intelligibility  of 
the  compressed  signal. 

Another  available  technique  is  based  on  a  pause 
deletion  or  vowel  compression  mode  of  operation.   This 
entails  the  use  of  a  fast-acting  clutch  to  stop  the  tape 
recorder  for  the  duration  of  the  suppressed  pauses  or 
vowels  and  to  restart  again  upon  the  presence  of  shorter 
signals  (VOCOM-I). 




Since commercial availability and cost of effective speech
compression units are of special importance to educators, and since
rate preference for reading and spoken material is also important
in today's educational systems, it seems important to look for the
most positive aspects in several methods of speech compression.
Therefore, the purpose of our present investigation was to compare
listeners' rate preferences for two different methods of speech
compression/expansion.


Method 


Subjects: 

Twenty  adult  females  served  as  subjects.   All  were  students  or 
staff  at  Oklahoma  State  University.   None  of  the  subjects  had  a 
past  history  of  speech,  language  or  hearing  difficulties,  and  none 
had  previously  participated  in  a  rate  preference  study.   The  subjects 
ranged  in  age  from  20  to  32  years,  with  a  mean  age  of  22.5  years. 

Recording  Material: 

The  prose  passage,  "The  Rainbow  Passage"  (Fairbanks,  1960)  was 
recorded  by  a  professional  radio  announcer  who  spoke  in  a  General 
American  dialect.   The  passage  was  recorded  in  a  sound-treated  room 
using  a  Magnecord  model  1022  tape  recorder  and  associated  RCA  model 
77DX  microphone. 

The  middle  four  sentences  of  the  passage  (55  words)  were  used 
for  the  time  alteration  procedures.   The  first  and  last  sentences 
were  deleted  in  an  attempt  to  avoid  biasing  effects  of  start-up  and 
slow-down  that  might  affect  listeners'  perception  of  the  overall 
reading  rate. 

Methods  of  Time  Alteration: 

The original recording of the reading was time-altered by two
methods. The first method (VOCOM-I) is a speech compression process
that selectively removes pauses and shortens vowel-type sounds,
acting like a rapid-acting voice-actuated recorder. Further,
when in the expansion mode, the process adds pauses of a controlled
duration whenever a normal pause is detected in the speech. The
procedure employs a high-speed start-stop clutch in addition to a
noise reject control and the pause expand-compress control.

The second method of time alteration (Varispeech I) employs a
periodic deletion or repetition procedure to shorten or lengthen
the original recording time. Essentially, this method is an
electronically sophisticated random access memory (RAM) device that
eliminates the mechanical problems of the older periodic discard
apparatus designed by Fairbanks, Everitt, and Jaeger (1954) by
disposing of high-speed rotating heads and slip-rings and by
incorporating IC materials, digital computer memory, A to D and
D to A converters, and electronic filtering into the procedure. This
allows time compression and time expansion of tape recorded
materials without serious distortion to pitch and quality
of the recording.
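
The general idea of periodic deletion and repetition can be illustrated on a list of digitized samples. The sketch below is a simplified illustration under stated assumptions, not a description of the Varispeech I circuitry: the function names, segment lengths, and the toy signal are hypothetical, and no smoothing is applied at the points where segments are joined.

```python
def periodic_compress(samples, keep, discard):
    """Shorten a signal by keeping `keep` samples and then dropping
    the next `discard` samples, repeated over the whole signal."""
    out = []
    period = keep + discard
    for start in range(0, len(samples), period):
        out.extend(samples[start:start + keep])
    return out


def periodic_expand(samples, segment, repeat):
    """Lengthen a signal by replaying the last `repeat` samples of
    every `segment`-sample block immediately after that block."""
    out = []
    for start in range(0, len(samples), segment):
        block = samples[start:start + segment]
        out.extend(block)
        # max() guards against repeat == 0 or a short final block.
        out.extend(block[max(0, len(block) - repeat):])
    return out


# Toy signal: 100 numbered samples standing in for digitized speech.
signal = list(range(100))

compressed = periodic_compress(signal, keep=7, discard=3)   # 30% shorter
expanded = periodic_expand(signal, segment=10, repeat=2)    # 20% longer

print(len(signal), len(compressed), len(expanded))
```

Because whole segments are removed or replayed at the original playback speed, the duration changes while the frequency content of the retained segments is left as it was, which is the property the text attributes to this class of device.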

The  original  recordings  were  time-altered  by  both 
methods  to  yield  seven  different  rates:   150,  175,  200,  225, 
250,  275,  and  300  words  per  minute. 

Construction of the Master Tapes:

A paired comparison procedure described by Guilford
(1954) was employed for presentation of the seven different
rates for preference evaluation. Each master tape included
twenty-one pairs (n(n-1)/2) of the passage, with order of
presentation as well as order of the individual readings in
each pair established to avoid time and space errors (Ross,
1934). Each of the seven rates appeared six times on each
master tape.
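
The counts given above (twenty-one pairs, each rate appearing six times) follow from pairing every rate once with every other rate. A minimal sketch is shown below. Note that the ordering here is a simple random shuffle with an arbitrary seed, used only as a stand-in; it is not the optimal ordering of Ross (1934) that the study reports using.

```python
from collections import Counter
from itertools import combinations
import random

rates = [150, 175, 200, 225, 250, 275, 300]

# Every unordered pair of distinct rates: n(n-1)/2 pairs for n = 7.
pairs = list(combinations(rates, 2))

# Stand-in ordering (an assumption, not Ross's method): randomize which
# member of each pair is heard first, then shuffle the order of the pairs.
rng = random.Random(0)  # arbitrary seed so the schedule is repeatable
schedule = [(a, b) if rng.random() < 0.5 else (b, a) for a, b in pairs]
rng.shuffle(schedule)

# How many times each rate is heard across the whole tape.
appearances = Counter(rate for pair in schedule for rate in pair)

print(len(schedule), dict(appearances))
```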

In constructing each master tape, high-quality Sony
model TC 650 tape recorders were employed to reproduce the
seven rates of time alteration for each compression method.
A one-second pause was inserted between the two readings
in each pair and a three-second pause was inserted between
each sample pair of readings. In addition, a set of
prerecorded instructions and three practice trials were
included in the presentation for familiarity purposes.

Each  of  the  20  subjects  participated  in  a  listening 
session  lasting  30  minutes.   Subjects'  preferences  were 
recorded  on  a  separate  sheet  by  circling  one  of  two 
numbered  pairs.   The  tapes  were  presented  binaurally  to 
each  subject  through  Superex,  model  ST-M  cushioned 
earphones  from  a  Sony  model  TC-650  tape  recorder.   One  and 
one-half  months  separated  the  preference  rating  sessions  for 
the  two  methods  of  speech  compression. 


Results 


The data in Slide 1 indicate the mean proportion of cases in
which a given rate was preferred when paired with another
rate for each of the seven different rates obtained from the
VOCOM-I compressor. To obtain these proportions, it was necessary
to construct tables for each of the 20 subjects in the study,
and to divide the frequency of judgments in each rate category
by the total number of judgments made. For example, the value
0.85 (column 3, row 6) indicates that 85 percent of the 20
subjects listening to the VOCOM-I tape preferred the rate of 200
words per minute when it was paired with the rate of 275 words
per minute.

SLIDE 1. EXPERIMENTAL PROPORTIONS, SUMMED PROPORTIONS, AND RANK ORDERINGS
OF SEVEN RATES BASED ON THE EVALUATIONS OF SUBJECTS LISTENING TO
THE VOCOM I PROCESS OF TEMPORAL ALTERATION

Rates
(WPM)     150     175     200     225     250     275     300

150        --     .95     .95    1.00     .95     .80     .65
175       .05      --    1.00    1.00     .85     .70     .70
200       .05     .00      --     .30     .40     .15     .05
225       .00     .00     .70      --     .35     .05     .00
250       .05     .15     .60     .65      --     .40     .05
275       .20     .30     .85     .95     .60      --     .45
300       .35     .30     .95    1.00     .95     .55      --

Σp        .70    1.70    5.05    4.90    4.10    2.65    1.90

Rank
Order       7       6       1       2       3       4       5

SLIDE 2. EXPERIMENTAL PROPORTIONS, SUMMED PROPORTIONS, AND RANK ORDERINGS
OF SEVEN RATES BASED ON THE EVALUATIONS OF SUBJECTS LISTENING TO
THE VARISPEECH I PROCESS OF TEMPORAL ALTERATION

Rates
(WPM)     150     175     200     225     250     275     300

150        --     .85    1.00     .95     .90     .85     .50
175       .15      --     .70    1.00     .60     .20     .20
200       .00     .30      --     .55     .15     .20     .05
225       .05     .00     .45      --     .45     .15     .05
250       .10     .40     .85     .55      --     .25     .05
275       .15     .80     .80     .85     .75      --     .05
300       .50     .80     .95     .95     .95     .95      --

Σp        .95    3.15    4.75    4.85    3.80    2.60     .90

Rank
Order       6       4       2       1       3       5       7

The rank ordering of the seven rates investigated reflects
the listening rate preferences of the entire group of subjects.
The slide presents the most preferred rate as "1," and the least
preferred rate as "7." To obtain these rankings, the proportions
in each column of the table were summed and the orderings were
determined from the summed proportions (Σp).
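
The summing and ranking step can be sketched as follows, using the VOCOM I proportions. Each entry is read as the proportion of subjects who preferred the column rate when it was paired with the row rate, which is the reading implied by the worked example in the text; the function and variable names are illustrative assumptions.

```python
rates = [150, 175, 200, 225, 250, 275, 300]

# VOCOM I proportions: rows are the comparison rate, columns the preferred
# rate. None marks the diagonal (a rate is never paired with itself).
vocom = [
    [None, .95, .95, 1.00, .95, .80, .65],
    [.05, None, 1.00, 1.00, .85, .70, .70],
    [.05, .00, None, .30, .40, .15, .05],
    [.00, .00, .70, None, .35, .05, .00],
    [.05, .15, .60, .65, None, .40, .05],
    [.20, .30, .85, .95, .60, None, .45],
    [.35, .30, .95, 1.00, .95, .55, None],
]


def summed_proportions(matrix):
    """Sum each column, skipping the diagonal, rounded to two decimals."""
    n = len(matrix)
    return [
        round(sum(matrix[r][c] for r in range(n) if r != c), 2)
        for c in range(n)
    ]


def rank_order(sums):
    """Rank 1 goes to the largest summed proportion, rank n to the smallest."""
    order = sorted(range(len(sums)), key=lambda i: sums[i], reverse=True)
    ranks = [0] * len(sums)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


sums = summed_proportions(vocom)
ranks = rank_order(sums)
for rate, total, rank in zip(rates, sums, ranks):
    print(rate, total, rank)
```

This simple ranking assumes no ties among the summed proportions, which holds for both sets of data reported here.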

The  rank  orderings  in  Slide  1,  the  selective  time  alteration 
procedure,  show  that  subjects  most  preferred  200  words  per  minute, 
with  225,  250  and  275  representing  the  second,  third,  and  fourth 
choices,  and  with  150  and  175  words  per  minute  as  the  least 
preferred  rates. 

The  rank  orderings  in  Slide  2,  the  systematic  time  alteration 
procedure,  show  that  subjects  most  preferred  225  words  per  minute, 
with  200,  250,  and  175  words  per  minute  as  the  second,  third, 
and  fourth  choices,  and  with  150  and  300  words  per  minute  being 
preferred  least. 

These  data  suggest  somewhat  different  preference  ratings 
for  the  two  time  alteration  methods  under  investigation. 

Discussion 


The results of the present investigation for selective
(VOCOM-I) time alteration compare favorably with data
reported by Cain and Lass (1974) and by Lass and Prater
(1973) to the extent that 200 words per minute was the
most preferred listening rate. Results of systematic
(Lexicon-Varispeech I) time alteration, in which subjects
chose 225 words per minute as the most preferred rate,
are more consistent with data reported by Foulke and Sticht
(1966), who found that their subjects chose 207 words per
minute as the most preferred rate of listening.

These  results  are  somewhat  surprising  in  that  the 
systematic  time  alteration  in  the  present  study  (Varispeech-I) 
shows  faster  rate  preferences  than  found  by  Lass  and  his 
associates  (1973) ,  who  have  found  rate  preferences  of  about 
179  words  per  minute  using  a  random  electro-mechanical 
technique  (Whirling  Dervish) . 

Several explanations are available when considering
the differences in findings of our present study and the
previous studies on rate preference, namely: (1) degree
of compression -- fairly consistent within the present
methods studied, and between the present study and those
of Lass and associates, that is, 150 - 300 words per
minute; (2) prior experience with compressed speech
-- Lass and Goff (1974) have reported no differences
in consistency of speech rate evaluations of experienced
and inexperienced listeners; therefore, none is expected
from the inexperienced listeners in the present study;
(3) the nature of the material to be comprehended --
the "Rainbow Passage" was used in all of the studies;
and (4) the quality of the speaker's voice -- all speakers
employed in the present study and the previous studies (Lass
et al., 1973) were professional radio announcers with
perceptually normal voices. This leaves only two areas
as possible variables: (a) the rate of the original speech
recording, and (b) the method of compression.

Thus,  the  possibility  exists  that  the  unaltered 
rate  of  speaking  for  the  present  study  (202  wpm)  and 
that  used  by  Cain  and  Lass  (1974)  (175  wpm)  differ  enough 
to  account  for  the  choice  of  the  higher  preferred  rate  of 
225  words  per  minute  in  the  present  study.   That  is,  if 
subjects  chose  rates  most  closely  aligned  to  the  natural 
rate  of  the  unaltered  portion  of  the  passage  pair,  then 
the  findings  in  the  present  study  might  be  expected.   This 
finding  could  be  supported  by  the  work  of  Foulke  and  Sticht 
(1966) .   Further  studies  must  be  completed  to  determine  the 
role  of  the  unaltered  speaking  rate  and  the  interactions 
with  the  two  methods  of  speech  compression. 

Finally, the methods of compressing the passage appear to
have an effect on the present differences in ratings of
preference. That is, the VOCOM-I technique time alters
important speech characteristics used for intelligible,
comprehensible, and therefore, preferred listening rates,
namely: pause times and vowel durations. Several authors
(Liberman, et al., 1963; House and Fairbanks, 1953; and
Beasley, 1974) have indicated the relative importance of
fundamental frequency, amplitude, and duration of voiced
segments of speech in relation to consonant sounds for
good speech perception.

Thus,  differences  in  listener  rate  preference  between 
the  selective  and  systematic  time  alteration  methods  may  be 
explained  in  several  ways  —  in  that  the  selective  procedure 
(VOCOM-I)  removes  portions  of  the  signal  that  make  the 
passage  less  intelligible  and  more  difficult  to  understand. 
In  addition,  the  unaltered  speaking  rate  (202  wpm)  had 
consonant-vowel  transitions  and  intra-sentence  pauses 
that  were  constricted  in  the  time  domain,  thus  making 
even  more  difficult  the  listenability  of  the  vowel  compressed/ 
pause  deleted  section. 

Furthermore, the fast-acting clutch system of VOCOM-I
suppresses pauses, but cannot distinguish between the
various kinds of pauses. That is, juncture pauses,
which are important for the mental comprehension of speech,
cannot be altered without reducing, or at least interfering
with, listener comprehension, and probably preference.

These awkward reductions in the important features
of the present reading passage provoked post hoc responses
from listeners such as, "the low rates sounded like
stuttering," or "it was hard to listen to the first (VOCOM I)
recording because of the choppy-fragmented sentence." In fact,
ninety percent of the 20 subjects questioned preferred the
Varispeech I tape in terms of overall "listenability."

In  summary,  it  must  be  expected  that  original  recording 
rate  and  method  of  compression  interact  to  place  special 
constraints  on  the  pause  deletion/vowel  compression  method 
that  forces  listeners  to  accept  slower  rates  of  presentation 
as  their  most  preferred.   Therefore,  the  implications  for 
use  of  the  two  methods  must  be  based  on  the  nature  of  the 
material  to  be  heard  (prose,  technical,  lecture)  and  the 
rate  of  the  original  recording. 

Future research must detail the differences in the
two techniques for different rates of the original unaltered
recording, difficulty of material, vocal characteristics of
the reader, and comprehension of the various permutations
of these factors. Furthermore, since it is possible to alter
the pause time and vowel time independently on the VOCOM I
(selective deletion) compressor, it would be interesting to
determine if manipulating these factors separately (e.g., pause
deletion only or vowel compression only) would alter subjects'
listening rate preferences.




References 


Beasley,  D.S.   Auditory  Analysis  of  Time-Varied  Sentential 
Approximations.   In  Time-Compressed  Speech/  (edited  by 
S.  Duker) ,  Metechen;  New  Jersey,  Scarecrow  Press,  692  -  701 
1974. 

Cain,  C.J.  and  N.J.  Lass,  Listening  Rate  Preferences  of  Adults. 
In  Time-Compressed  Speech.   (edited  by  S.  Duker)  ,  Metechen, 
New  Jersey,  Scarecrow  Press,  674  -  679,   1974. 

Fairbanks,  G.   Voice  and  Articulation  Drillbook.   New  York, 
Harper  and  Row,  196  0. 

Fairbanks,  Everitt,  &  Jaeger,  Time  or  frequency  compression  - 
expansion  of  speech.  Transactions  of  the  Institute  of  Radio 
Engineers  Professional  Group  on  Audio,   AV  -  2,   1954,   7-12. 

Foulke,  E.  and  T.G.  Sticht,  Listening  Rate  Preferences  of 

College  Students  for  Literary  Material  of  Moderate  Difficulty. 
Journal  of  Research,   6,  397  -  401,  1966. 

Guilford,  J. P.   Psychometric  Methods.   New  York;   McGraw  -  Hill, 
154  -  157,   1954 

House,  A.S.  and  Fairbanks,  G.   The  Influence  of  Consonant 
environment  upon  Secondary  Acoustical  characteristics  of 
vowels.   Journal  of  the  Acoustical  Society  of  America,   25, 
105  -  113,   1953. 

Hutton,  C.L.   A  Psychophysical  Study  of  Speech  Rate. 
Doctoral  dissertation,  University  of  Illinois,   1954. 

Lass,  N.J.  and  C.E.  Prater,   A  Comparative  Study  of  Listening 
Rate  Preferences  for  Oral  Reading  and  Impromptu  Speaking 
Tasks.   Journal  of  Communication,   23,  95  -  102,   1973. 

Lass,  N.J.  and  Goff,  V.L.  Consistency  of  Speech  Rate  Evaluations 
of  Experienced  and  Inexperienced  Listeners.   In  Time-Compressed 
Speech.   (edited  by  S.  Duker),  Metechen,  New  Jersey,  Scarecrow 
Press,   679  -  683,   1974. 

Lass,  N.J.  and  Fultz,  V.A.   A  Normative  Study  of  Children's 
Listening  Rate  Preferences.   Language  and  Speech,   1976 
(In  Press) 

Liberman,  A.M.,  Cooper,  F.S.,  Harris,  K.  and  MacNeilage,  P.M. 
A  Motor  Theory  of  Speech  Perception.   Proceedings  of  the 
Speech  Communication  Seminar,  Vol  II.   Stockholm:   Speech 
Transmission  Laboratory,  Royal  Institute  of  Technology,   196  3. 

Ross,  R.T.   Optimal  Orders  for  the  Presentation  of  Pairs  in  the 
Method  of  Paired  Comparisons.   Journal  of  Educational  Psychology, 
25, 375 - 382, 1934.

301 


Exposure to Time-Compressed Speech: Effect on Subjects'
Listening Rate Preferences and Listening Comprehension Skills
by Lass, N. J., Foulke, E., Nester, A. A., & Comerci, J.


303 


EXPOSURE  TO  TIME-COMPRESSED  SPEECH:     EFFECT  ON  SUBJECTS'   LISTENING 
RATE  PREFERENCES  AND  LISTENING  COMPREHENSION  SKILLS 


Norman  J.    Lass,    Emerson  Foulke,   Ann  A.    Nester, 
and  Joanne  Comerci 

The purpose of this investigation was to determine the effect of training
by means of systematic exposure to time-compressed speech on listeners'
listening rate preferences and listening comprehension skills. Two groups
of subjects, 15 in an experimental group and 15 in a control group,
participated in the study. The experimental group received exposure to
time-compressed prose material in 12 listening sessions over a six-week
period. The amount of time compression was progressively increased from
the first to the sixth week of exposure: from 225 wpm to 350 wpm in 25 wpm
increments per week. The control group received no exposure to
time-compressed speech. Both groups were given listening rate preference
and listening comprehension tasks before and after the six-week period.
Results of their performance indicated that training by means of exposure
to time-compressed speech influenced subjects' listening rate preferences
but did not significantly improve their listening comprehension skills.
Implications of these findings and suggestions for future inquiry are
discussed.


Paper presented at the Third Louisville Conference on Rate-Controlled Speech,
November 3-5, 1975, Louisville, Kentucky, and to be published in the Journal
of Auditory Research, 1976.


304 


INTRODUCTION 

With the development of speech compression equipment which allows for
the control of the rate of recorded speech, there has been an increased
interest in the listening rate preferences of children and adults (Hutton,
1954; Foulke and Sticht, 1966; Lass and Cain, 1972; Lass and Prater, 1973;
Cain and Lass, 1974; Lass and Fultz, 1976).

The effect of training through exposure to time-compressed speech,
although applied to the study of listener comprehension with inconsistent
findings (Foulke, 1964; Orr, Friedman, and Williams, 1965; Voor and Miller,
1965; Friedman et al., 1966), has never been investigated systematically
for listening rate preferences. However, the influence of such exposure
on listeners' listening rate preferences has been suggested by the findings
of Iverson (1956) and Foulke (1965). The purpose of the present
investigation was to determine the effect of training on listeners' listening
rate preferences and comprehension of time-compressed speech.


305 


METHOD 
Subjects 

Two groups of subjects were employed in this investigation: an
experimental group and a control group. The experimental group received
training by means of systematic exposure to time-compressed speech while the
control group received no training. Each group consisted of 15 subjects, all
students at West Virginia University who had no previous exposure to
time-compressed speech and no reported hearing difficulty.


Experimental  Materials 

Listening  Rate  Preferences.   The  passage  employed  for  the  subjects' 
listening  rate  preference  judgments  was  the  first  paragraph  of  Fairbanks' 
(1960) "The Rainbow Passage." The passage was recorded by a professional
male  reader;  the  recording  of  his  reading  was  made  in  a  sound-treated 
room  at  the  Perceptual  Alternatives  Laboratory  at  the  University  of 
Louisville  using  high-quality  recording  equipment. 

Of  the  six  sentences  in  the  reading  passage,  only  the  middle  four 
sentences, a total of 55 words, were employed for time-alteration purposes
and  used  in  the  experiment.   The  first  and  last  sentences  were  deleted  in 
an  attempt  to  avoid  any  possible  effects  on  rate  judgments  associated  with 
initiating  and  terminating  readings. 

The original recording of the reading was time-altered by means of a
speech compressor (Graham, 1971) to yield nine different rates, from 100
to 300 wpm in 25 wpm increments. A paired comparison procedure was employed
for presentation of the different rates for subjects' listening rate
preference evaluations. The order used for presentation of the pairs of
readings as well as the order for individual readings in each pair was
established by Ross (1934) to avoid time and space errors. The master tape
contained 36 pairs of readings, with each of the nine rates appearing eight
times on the tape. In constructing the master tape, a total of eight
electronic reproductions of each of the nine rates were made on high-quality
recording equipment in the Speech and Hearing Sciences Laboratory at West
Virginia University. A one-second pause was employed between the two
readings in each pair, and a three-second pause was inserted between each
pair of readings. In addition to the 36 pairs, the master tape contained a
set of instructions explaining the subjects' task as well as three pairs of
readings to be evaluated for practice purposes.
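
The counts given above (36 pairs, each rate heard eight times) follow
directly from pairing nine rates in every possible way. The short Python
sketch below checks that arithmetic. It only enumerates the unordered
pairs; it does not attempt to reproduce the presentation order taken from
Ross (1934), and the variable names are illustrative.

```python
from itertools import combinations

# The nine rates on the master tape: 100 to 300 wpm in 25 wpm steps.
rates_wpm = list(range(100, 301, 25))

# Every unordered pairing of two different rates.
pairs = list(combinations(rates_wpm, 2))

# How many pairs each rate takes part in.
appearances = {rate: sum(rate in pair for pair in pairs) for rate in rates_wpm}

print(len(rates_wpm), "rates")
print(len(pairs), "pairs")
print(appearances)
```

With nine rates there are 9 x 8 / 2 = 36 pairs, and each rate meets each of
the other eight rates exactly once.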

Listening Comprehension. The Comprehension subtest of the Nelson-Denny
Reading Test (Brown, 1968) was employed for the measurement of
listening comprehension. This test includes eight brief selections of
scientific and literary content that are appropriate in interest and level
of difficulty for college students. For each selection, the test contains
a number of multiple-choice questions based on the information contained in
the selection. Forms A and B of this test were employed in the study. The
recordings of both forms were made by the same professional reader who
recorded the material for the listening rate preference task. The
recordings were made in a sound-treated room at the Perceptual Alternatives
Laboratory of the University of Louisville. These recordings were
time-compressed to yield a rate of 275 wpm for the master tape.


307 


Experimental  Procedure 

Slide 1 shows the procedure employed with the experimental group. The
subjects in this group participated in a total of 16 sessions. In the first
session, they were given the listening rate preference task; in the second
session, they were given Form A of the Nelson-Denny Reading Test. The next
12 sessions consisted of one-half hour exposure sessions, two per week for
six weeks, in which they listened to short stories appropriate in interest
for college students under time-compressed conditions. The stories were
progressively increased in amount of time compression from the first to
the sixth week: from 225 wpm to 350 wpm in 25 wpm increments per week.
At the end of the sixth week of exposure to time compression, the subjects
were again given the listening rate preference task and Form B of the
Nelson-Denny Reading Test.
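
The exposure schedule just described can be laid out week by week. The
Python sketch below does so under one assumption not spelled out in the
text: that the 12 exposure sessions are numbered consecutively as sessions
3 through 14, between the two pre-test and the two post-test sessions. The
constant names are illustrative.

```python
START_WPM = 225              # rate in the first week of exposure
STEP_WPM = 25                # weekly increase
WEEKS = 6
SESSIONS_PER_WEEK = 2
FIRST_EXPOSURE_SESSION = 3   # assumed: sessions 1 and 2 are the pre-tests

schedule = []
for week in range(1, WEEKS + 1):
    rate = START_WPM + STEP_WPM * (week - 1)
    first = FIRST_EXPOSURE_SESSION + SESSIONS_PER_WEEK * (week - 1)
    sessions = list(range(first, first + SESSIONS_PER_WEEK))
    schedule.append((week, sessions, rate))

for week, sessions, rate in schedule:
    print(f"week {week}: sessions {sessions}, {rate} wpm")

# Two pre-test sessions, the exposure sessions, and two post-test sessions.
total_sessions = 2 + WEEKS * SESSIONS_PER_WEEK + 2
print(total_sessions, "sessions in all")
```

The six weekly rates run 225, 250, 275, 300, 325, and 350 wpm, and the
session total agrees with the 16 sessions reported for the experimental
group.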

The  procedure  employed  for  the  control  group  is  shown  in  Slide  2. 
The  subjects  in  the  control  group  participated  in  a  total  of  four  sessions. 
In  the  first  two  sessions,  they  were  given  the  listening  rate  preference 
task  and  Form  A  of  the  Nelson-Denny  Reading  Test.   Then,  after  approximately 
a  six-week  period  of  time  in  which  they  were  not  exposed  to  time-compressed 
speech,  they  were  given  the  listening  rate  preference  task  and  Form  B  of 
the  Nelson-Denny  Reading  Test. 

In the listening rate preference sessions, the subjects' task was to
determine to which of the two rates in each pair they preferred to listen.
In the listening comprehension sessions, they listened to each of the
eight selections and completed the multiple-choice test questions covering
each of the selections. The test scores for the eight selections were summed
to obtain a single comprehension score for each subject. All listening
sessions were held in a sound-treated room at West Virginia University.


308 


EXPERIMENTAL GROUP

Session #    Task

1            Listening Rate Preference

2            Listening Comprehension
             (Nelson-Denny Reading Test, Form A)

             SIX-WEEK EXPOSURE TO TIME-COMPRESSED
             SHORT STORIES

3-4          week #1    225 wpm
5-6          week #2    250 wpm
7-8          week #3    275 wpm
9-10         week #4    300 wpm
11-12        week #5    325 wpm
13-14        week #6    350 wpm

15           Listening Rate Preference

16           Listening Comprehension
             (Nelson-Denny Reading Test, Form B)


Slide 1.


309 


CONTROL  GROUP 

Session  #  Task 

1  Listening  Rate  Preference 

2  Listening  Comprehension 
(Nelson-Denny  Reading  Test,  Form  A) 


SIX-WEEK  PERIOD  OF  NO  EXPOSURE 
TO  TIME-COMPRESSED  SPEECH 


3  Listening  Rate  Preference 

4  Listening  Comprehension 
(Nelson-Denny  Reading  Test,  Form  B) 


Slide  2. 


310 


RESULTS 
Listening  Rate  Preferences 

The listening rate preference judgments of subjects in the experimental
and control groups for the two sessions were determined from their
responses on the listening rate preference tasks. Slide 3 shows how this
information was obtained from the data. It contains the mean proportions
of cases in the experimental group in which a given rate was preferred
when paired with another rate, for each of the nine rates employed in the
study. To obtain the proportions, it was necessary to construct tables
for each of the 15 subjects in each of the two groups and to divide the
frequency of judgments in a given category by the total number of judgments
made. For example, in this slide, the value .733 (Column 5, Row 6) is to
be interpreted as follows: approximately 73 percent of the 15 subjects
in the experimental group listening to the master tape preferred the rate
of 200 wpm to the rate of 225 wpm.

The rank ordering of the nine rates investigated based on all of the
obtained proportions in each group and each condition reflects the listening
rate preferences for the entire group and condition. To obtain the rank
orders for the nine rates, the proportions presented in each condition
were summed and the rankings were determined from the summed proportions
(ΣP). The rank orderings are given at the foot of each column. For example,
in this slide, the summed proportions indicate that the largest total
proportion for the nine rates was obtained for the rate of 175 wpm; this
finding indicates that 175 wpm was preferred most of all rates when matched
with each of the other eight rates employed in the study. Therefore, 175
wpm received the ranking of "1". The second largest summed proportion
was yielded by the rate of 200 wpm; therefore, it received a ranking of


311 


Rates
(wpm)     100     125     150     175     200     225     250     275     300

100        --    .933   1.000   1.000   1.000    .933   1.000    .733    .600
125      .067      --    .800   1.000   1.000    .800    .467    .133    .067
150      .000    .200      --    .733    .667    .133    .133    .067    .133
175      .000    .000    .267      --    .067    .133    .133    .000    .000
200      .000    .000    .333    .933      --    .267    .000    .000    .067
225      .067    .200    .867    .867    .733      --    .533    .000    .000
250      .000    .533    .867    .867   1.000    .467      --    .133    .000
275      .267    .867    .933   1.000   1.000   1.000    .867      --    .000
300      .400    .933    .867   1.000    .933   1.000   1.000   1.000      --

ΣP      0.801   3.666   5.934   7.400   6.400   4.733   4.133   2.066   0.867

Rank
Order       9       6       3       1       2       4       5       7       8

Slide 3.


312 


"2",  etc. 

Slide 4 contains a summary of the rank ordering of all nine rates,
from most preferred to least preferred, for each of the two subject groups
and for each of the two listening sessions. The slide indicates that for
the experimental group, the ordering of preferred listening rates differed
considerably between the pre-exposure and post-exposure sessions. The most
preferred rate in the pre-exposure session was 175 wpm, while the most
preferred rate in the post-exposure session was 225 wpm. Furthermore, the
subjects generally preferred faster rates in the post-exposure session,
with the exception of the fastest and slowest rates employed in the study,
which they tended to prefer least in both sessions.
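
The pre-exposure ordering for the experimental group can be traced back to
the proportions in Slide 3. The Python sketch below sums each column of
that slide and ranks the rates by the sums, as described for the summed
proportions. The figures are transcribed from Slide 3; the function and
variable names are illustrative and not part of the original analysis.

```python
RATES_WPM = [100, 125, 150, 175, 200, 225, 250, 275, 300]

# Rows of Slide 3, top to bottom; None marks the empty diagonal.
# Each entry is the proportion of subjects who preferred the COLUMN rate
# when it was paired with the ROW rate.
PROPORTIONS = [
    [None, .933, 1.000, 1.000, 1.000, .933, 1.000, .733, .600],
    [.067, None, .800, 1.000, 1.000, .800, .467, .133, .067],
    [.000, .200, None, .733, .667, .133, .133, .067, .133],
    [.000, .000, .267, None, .067, .133, .133, .000, .000],
    [.000, .000, .333, .933, None, .267, .000, .000, .067],
    [.067, .200, .867, .867, .733, None, .533, .000, .000],
    [.000, .533, .867, .867, 1.000, .467, None, .133, .000],
    [.267, .867, .933, 1.000, 1.000, 1.000, .867, None, .000],
    [.400, .933, .867, 1.000, .933, 1.000, 1.000, 1.000, None],
]


def column_sums(matrix):
    """Sum each column, skipping the diagonal, rounded to three places."""
    size = len(matrix)
    return [
        round(sum(row[col] for row in matrix if row[col] is not None), 3)
        for col in range(size)
    ]


def rank_order(sums):
    """Rank 1 goes to the largest sum, rank 2 to the next largest, etc."""
    by_size = sorted(range(len(sums)), key=lambda i: sums[i], reverse=True)
    ranks = [0] * len(sums)
    for rank, index in enumerate(by_size, start=1):
        ranks[index] = rank
    return ranks


sums = column_sums(PROPORTIONS)
ranks = rank_order(sums)
preference_order = [rate for _, rate in sorted(zip(ranks, RATES_WPM))]

print(sums)
print(ranks)
print(preference_order)
```

The column sums and ranks agree with the foot of Slide 3, and the resulting
order, from most to least preferred, matches the Session #1 column for the
experimental group in Slide 4.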

However,  for  the  control  group,  the  ordering  of  preferred  listening 
rates  was  similar  for  the  two  sessions.   They  preferred  most  the  rate  of 
175  wpm  in  both  sessions,  and  the  rates  of  200,  225,  275,  125,  300,  and 
100  wpm  occupied  the  same  position  in  their  preference  ordering  in  both 
sessions. 
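
The comparison of the two sessions can be made explicit from the orderings
in Slide 4. The Python sketch below lists the rates that kept the same
position in both sessions and, for each rate, how many positions it moved
(a positive value means the rate became more preferred). The orderings are
transcribed from Slide 4; the names are illustrative.

```python
# Orderings from Slide 4, most preferred first.
ORDERINGS = {
    ("experimental", 1): [175, 200, 150, 225, 250, 125, 275, 300, 100],
    ("experimental", 2): [225, 250, 200, 175, 275, 150, 300, 125, 100],
    ("control", 1): [175, 200, 225, 150, 250, 275, 125, 300, 100],
    ("control", 2): [175, 200, 225, 250, 150, 275, 125, 300, 100],
}


def unchanged_rates(before, after):
    """Rates that occupy the same position in both orderings."""
    return [b for b, a in zip(before, after) if b == a]


def rank_shift(before, after):
    """Positions gained (positive) or lost (negative) by each rate."""
    return {rate: before.index(rate) - after.index(rate) for rate in before}


control_same = unchanged_rates(ORDERINGS[("control", 1)],
                               ORDERINGS[("control", 2)])
experimental_same = unchanged_rates(ORDERINGS[("experimental", 1)],
                                    ORDERINGS[("experimental", 2)])
experimental_shift = rank_shift(ORDERINGS[("experimental", 1)],
                                ORDERINGS[("experimental", 2)])

print(control_same)
print(experimental_same)
print(experimental_shift)
```

In the control group seven of the nine rates keep their position, while in
the experimental group only 100 wpm does, and the rates from 225 to 300 wpm
all move up.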

Listening Comprehension

Slide 5 contains the mean, standard deviation, and range values of
the listening comprehension scores of both subject groups in the two
sessions. The slide indicates that both groups showed improved
comprehension scores from the first to the second sessions. However, the
difference in scores between the two sessions is very small for the two
groups.

Inferential statistical analysis, consisting of a two-factor (groups
x sessions) analysis of variance (Winer, 1970), was performed. Results
of the analysis indicated that there were no significant differences in
comprehension scores between groups (F=0.495, df=1,28), between sessions
(F=1.287, df=1,28), or the interaction of groups and sessions (F=0.068,
df=1,28).
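
As a rough check on the reported F ratios, the Python sketch below compares
each one with a tabled critical value. It assumes the conventional .05
level of significance, which the text does not state; for 1 and 28 degrees
of freedom the tabled critical value of F at that level is approximately
4.20. The names are illustrative.

```python
# Assumed: alpha = .05. Tabled critical F for df = 1, 28 is about 4.20.
F_CRITICAL_05_DF_1_28 = 4.20

# F ratios reported for the groups x sessions analysis of variance.
f_ratios = {
    "groups": 0.495,
    "sessions": 1.287,
    "groups x sessions": 0.068,
}

significant = {
    effect: f_value >= F_CRITICAL_05_DF_1_28
    for effect, f_value in f_ratios.items()
}

for effect, f_value in f_ratios.items():
    verdict = "significant" if significant[effect] else "not significant"
    print(f"{effect}: F = {f_value:.3f} ({verdict} at the assumed .05 level)")
```

All three ratios fall well below the critical value, in line with the
report of no significant differences.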


313 


   EXPERIMENTAL GROUP              CONTROL GROUP

Session #1    Session #2    Session #1    Session #2

   175           225           175           175
   200           250           200           200
   150           200           225           225
   225           175           150           250
   250           275           250           150
   125           150           275           275
   275           300           125           125
   300           125           300           300
   100           100           100           100

Slide 4.


GROUP           SESSION     MEAN      S.D.     RANGE

Experimental      #1        20.00     2.14     12-30
                  #2        22.00     1.65     12-28

Control           #1        21.93     4.30     15-29
                  #2        22.60     6.04     12-31

Slide 5.


314


DISCUSSION 

The results of our investigation indicate that training by
means of exposure to time-compressed speech does influence subjects'
listening rate preferences. This finding, which provides support for the
suggestions made by the findings of Iverson (1956) and Foulke (1965), should
be of applied interest to those concerned with research on time-compressed
speech. It suggests that listeners' listening rate preferences are not
necessarily fixed or permanent, but rather vary with their exposure to the
time-compression process. Therefore, it would appear necessary to specify
the previous exposure of subjects to be employed in a study before a decision
can be reached regarding their listening rate preferences, or before
assessment can be made of such preferences. Without control for previous
exposure to time-compressed speech, results of studies may be confounded
because of the heterogeneity of the subject sample on this particular
variable.

Since it has been shown that exposure to time-compressed speech
influences listeners' listening rate preferences, it would be interesting
to determine if the amount of exposure is also an important variable. That
is, do subjects' listening rate preferences vary differentially with the
amount of exposure to time-compressed speech? If so, we would expect
subjects who are exposed to time-compressed speech for a longer time period
than that employed in the present investigation to prefer even faster rates
than the subjects in this investigation. It is suggested that future research
further explore this issue.

Despite the influence of exposure to time-compressed speech on subjects'
listening rate preferences, such training, at least in terms of the amount
of exposure and rates of compression provided in the present study, does not
significantly improve subjects' listening comprehension skills. However, it
should be noted that the rate employed for assessment of listening
comprehension was 275 wpm, a rate beyond which listener comprehension in
general has been shown to decrease considerably (Foulke et al., 1962). Perhaps
a different compression rate for assessment of comprehension, or a longer
exposure period, or both, would have yielded greater differences between
subjects' pre- and post-exposure comprehension skills. Future research
should address itself to this issue.

In comparing the most preferred rates of subjects in our
study with the most preferred rate reported by blind listeners, it is
evident that blind listeners prefer faster rates. Foulke (1965), in
conducting a survey of the acceptability of time-compressed speech by
blind students, found that when presented with a variety of samples of
time-compressed speech, they preferred a rate of 275 wpm most often.
This difference between the listening rate preferences of the blind and the
sighted is not unexpected. Since blind students receive most of their
information by listening, while sighted students receive it through reading,
it is only natural that the blind could tolerate and prefer faster listening
rates than the sighted. In light of this motivation factor, it would be
interesting to determine the effect of training by means of systematic
exposure to time-compressed speech on blind listeners' listening rate
preferences. From the previous findings with blind subjects, and the
results of our present investigation, we anticipate that the blind can be
trained to prefer listening rates greater than 275 wpm. It is suggested
that future research attempt to substantiate or refute our expectations.


316 


REFERENCES 

Brown, J. I., The Nelson-Denny Reading Test. N.Y.: Houghton Mifflin, 1960.

Cain, C. J., and Lass, N. J., Listening rate preferences of adults. In
Duker, S. (Ed.), Time-Compressed Speech: An Anthology and Bibliography.
Metuchen, N. J.: Scarecrow Press, 1974.

Fairbanks, G., Voice and Articulation Drillbook. N.Y.: Harper and Row, 1960.

Foulke, E., The comprehension of rapid speech by the blind: Part II.
Cooperative Research Project No. 1370, Washington, D.C.: U.S. Department
of Health, Education, and Welfare, Office of Education, 1964.

Foulke, E., A survey of the acceptability of rapid speech. New Outlook
for the Blind, 60, 1965, 261-265.

Foulke, E., Amster, C. H., Nolan, C. Y., and Bixler, R. E., The comprehension
of rapid speech by the blind. Exceptional Children, 29, 1962, 134-141.

Foulke, E., and Sticht, T. G., Listening rate preferences of college students
for literary material of moderate difficulty. Journal of Auditory Research,
6, 1966, 397-401.

Friedman, H. L., Orr, D. B., Freedle, R. O., and Norris, C. M., Further
research on speeded speech as an educational medium. Progress Report
No. 2, Grant No. 7-48-7670-267. Washington, D.C.: U.S. Department of
Health, Education, and Welfare, Office of Education, 1966.

Graham, W., The Graham compressor, a technical development of the Fairbanks
method. In Foulke, E. (Ed.), Proceedings of the Second Louisville
Conference on Rate and/or Frequency-Controlled Speech. Louisville:
University of Louisville, 1971, 193-195.

Hutton,  C.  L.,  A  psychophysical  study  of  speech  rate.   Unpublished  Doctoral 
dissertation,  University  of  Illinois,  1954. 

Iverson,  L.,  Time  compression.   International  Journal  for  the  Education  of 
the  Blind,  5,  1956,  78-79. 

Lass, N. J., and Cain, C. J., A correlational study of listening rate
preferences and listeners' oral reading rates. Journal of Auditory
Research, 12, 1972, 308-312.

Lass,  N.  J.,  and  Fultz,  V.  A.,  A  normative  study  of  children's  listening 
rate  preferences.   Language  and  Speech,  1976  (In  Press). 

Lass,  N.  J.,  and  Prater,  C.  E.,  A  comparative  study  of  listening  rate 
preferences  for  oral  reading  and  impromptu  speaking  tasks.   Journal  of 
Communication,  23,  1973,  95-102. 


317 


Orr, D. B., Friedman, H. L., and Williams, J. C. C., Trainability of
listening comprehension of speeded discourse. Journal of Educational
Psychology, 56, 1965, 148-156.

Ross, R. T., Optimal orders for the presentation of pairs in the method of
paired comparisons. Journal of Educational Psychology, 25, 1934, 375-382.

Voor, J. B., and Miller, J. M., The effect of practice on the comprehension
of worded speech. Speech Monographs, 32, 1965, 452-455.

Winer, B. J., Statistical Principles in Experimental Design. N.Y.: McGraw-
Hill, 1970.


318 


Listeners'   Preferences  for  the   Rate  of  Presentation  of  Recorded 

Information 

by    Levine,    S.  J. 


319 


ABSTRACT 

LISTENERS'  PREFERENCES  FOR  THE  RATE  OF 
PRESENTATION  OF  RECORDED  INFORMATION 

By 

S.  Joseph  Levine 

Previous studies show that a listener has the potential
to receive recorded information at a rate far exceeding
the rates that are used in conversation and for the
production of tape recordings. However, few studies have
examined listeners' rate preferences. By using an
experimental setting that allowed listeners to autonomously
manipulate the rate of presentation of recorded information,
it was proposed that listeners would manifest a preference
for rate of presentation and would demonstrate rate
manipulation behaviors that were related to the difference
between the initial rate of presentation of the recorded
information and the listener's preferred rate.

Forty-eight elementary school children in the third,
fourth and fifth grade listened to a series of four recorded
presentations. While listening to each recorded presentation,
the subjects were allowed to manipulate the rate of
presentation of the recording through the use of a speech
compressor. Each of the four presentations began at a
different initial presentation rate. All subjects listened to
the same four recorded passages. The four initial
presentation rates used in the study were 100 words per minute,
150 words per minute, 200 words per minute, and 275 words
per minute. The rate manipulation behaviors of each subject
were recorded on a strip chart recorder for later analysis.

The  results  of  the  analyses  of  third,  fourth  and 
fifth  grade  subjects'  rate  manipulation  behaviors  support 
the  following  conclusions: 

1. Children will manipulate the rate of presentation
of recorded information in a self-paced listening
situation.

2. Children demonstrate a preference for rate.

3. The extent a listener alters the listening
rate is positively related to the difference between the
initial rate of presentation of recorded information and the
listener's manifest preference for rate.

The findings suggest a disparity between the rate
at which a student is able to listen and the rate at which
he prefers to listen to recorded information. This study
has also suggested that an initial presentation rate for
recorded information that varies greatly from the listener's
preferred rate of presentation will stimulate greater rate
change than an initial presentation rate that is close to
the listener's preferred rate. As such, instructional
materials that are designed for use in a self-paced listening
environment will be more likely to be altered by the
subject toward a preferred rate when the initial rate of
presentation is more different from the listener's preferred
rate.


322 


LISTENERS'  PREFERENCES  FOR  RATE  OF  PRESENTATION 
OF  RECORDED  INFORMATION 
S.  Joseph  Levine* 


Introduction 


The use of compressed speech as a procedure for altering the rate
of presentation of recorded material has been shown to be an effective
tool for increasing the efficiency of learning through listening
(Fairbanks et al., 1957; Foulke et al., 1962; Orr et al., 1965;
Sticht, 1968). Educators are just beginning to realize the implications
of this procedure, and schools are starting to provide pre-compressed
materials for students. It has been shown that listeners can more than
double the rate of presentation of recorded material without a significant
decrease in their comprehension (Wood, 1965). However, it has been
assumed that a listener who is provided with a pre-compressed recording
will want the recorded material presented at the fastest rate at which
he can attend and still comprehend. The study I am reporting today is
concerned with the situation in which the listener has autonomous
control over the rate of aural presentation. There are three basic
questions that the study examines:

1.  Will  listeners,  if  given  the  opportunity  to  alter  the  rate  of 
presentation  of  recorded  material,  alter  the  rate? 

2.  Do  listeners  display  defined  rate  altering  behaviors  in  a 
self-paced  listening  situation? 

3. Is it possible to determine an individual listener's manifest
listening rate preference for recorded material?


*Dr. Levine is Assistant Professor of Education
at Michigan State University, East Lansing, Michigan.


Many of the studies that examine the use of compressed
speech treat subjects in groups and examine group behavior in terms of
comprehension. The results of these studies are often interpreted to
mean that group behavior can be defined, and they are therefore often used
as a guide to the design of auditory learning experiences. Questions such as
"What is the best rate for college freshmen?" or "At what speed should
auditory material for 5th graders be presented?" are often asked.

This  study,  then,  backs  up  a  step  and  asks  about  individual  behaviors. 
If,  in  fact,  it  can  be  shown  that  individuals  have  a  preference  for  rate 
of  presentation  of  auditory  material — and  this  preference  is  unique  to 
each  individual — then  it  can  be  suggested  that  group  listening  experiences 
and/or  pre-compressed  recorded  materials  may  not  be  an  efficient  vehicle 
for  learning.   Rate  altered  listening  materials  may  be  detrimental  to 
learning  if  not  used  wisely. 

A group of listeners was presented with a series of rate-altered
listening segments, each at a different rate, and each listener was
individually given an opportunity to alter any or all of the segments to
better accommodate his or her own preference. The study examined whether or
not a manifest preference for rate of presentation does exist, whether or
not a listener will alter the rate of presentation to better suit his
individual preference, and whether it is possible to ascertain what this
manifest preference is for individual listeners.

Three variables were also examined to better understand instructional
options that may become available when self-paced listening opportunities
are provided within the school setting. These three variables are the
onset of manipulation of rate, the termination of manipulation of rate, and
the duration of manipulation of rate.

Each listener was presented with four recorded segments, each
at a different pre-altered rate. For segments that deviated greatly from
the  manifest  preference,  it  was  expected  that  the  onset  of  the  listener's 
manipulation  of  the  rate  would  occur  early  in  the  listening  experience. 
It  was  not  known  whether  this  deviation  from  manifest  preference  would 
relate  predictably  to  the  termination  or  the  duration  of  the  manipulation. 

Methods  and  Procedures 

The  population  for  the  study  comprised  elementary  school  children 
in  grades  three,  four,  and  five  who  displayed  no  hearing  deficits.   A 
sample  of  16  subjects  from  each  grade  level  (total  sample  of  48  subjects) 
was  selected  for  participation  in  the  study.   Selection  was  made  on  a 
random  basis  from  enrollment  lists  for  each  grade  level  of  an  elementary 
school. 

Each  of  the  16  subjects  for  each  grade  level  was  randomly  assigned 
to  one  of  12  different  treatments.   One  at  a  time,  each  subject  in  the 
study  listened  through  headphones  to  a  series  of  pre-recorded  and  rate- 
altered  listening  selections.   During  the  listening  experience  the  subject 
was  given  the  opportunity  to  alter  the  rate  of  presentation  of  the 
recorded  material  by  manipulating  a  single  rate  control  knob  on  a  metal 
box.   The  metal  box  had  a  plain  appearance  and  was  constructed  so  as  to 
provide  a  minimal  distraction  to  the  subject.   All  instructions  regarding 
the  listening  experience  and  actions  to  be  taken  by  the  subject  during 
the  listening  experience  were  delivered  on  the  tape  and  thereby  standardized 
for  all  subjects. 

The experimental setting consisted of a simple box with a large knob
for controlling the rate, earphones for listening, and a set of pictures to
focus the visual attention of the subject.  A speech compressor and a
chart recorder were hidden from the subject's view.

All treatments began with a short introductory segment to
familiarize the subject with the equipment and the procedure for altering
the rate.  Four listening segments followed, a single story divided
into four parts.

The listening segments were pre-altered to four different rates,
which served as the initial presentation rates for the segments.  Once
a segment began, the subject was free to alter the rate in any desired
manner.

The  four  rates  were: 

100  wpm — expanded  rate 

150  wpm — "normal"  rate 

200  wpm — moderately  compressed  rate 

275  wpm — highly  compressed  rate 

At the conclusion of each segment, the tape instructed the subject
to return the rate control knob to a zero point.  This ensured the
specified initial presentation rate for the next segment.

The twelve treatments were different orderings of the four initial
presentation rates within the listening experience.  The orderings were
used in an attempt to control for the interaction of the content of
particular segments of the listening experience and the initial
presentation rate of the segments.

Each subject in the study was provided the listening experience
individually in a single session, and served as his own control.  Data
were examined and analyzed in a number of different manners to yield
information regarding manifest preference for rate and rate manipulation
behaviors that occur in a self-paced listening situation.

Criteria  were  established  for  this  study  to  define  manifest  preference 
for  rate  and  mean  manifest  rate  preference.   First,  manifest  preference 
for  rate  would  be  evident  if  the  final  rate  for  all  four  segments  of  a 
subject's  listening  experience  would  fall  within  a  band  of  40  wpm.   This 
criterion  was  established  to  guarantee  rate  alteration  of  at  least  three 
of  the  four  listening  segments,  since  the  rate  differences  between  the 
four  segments  were  all  greater  than  40  wpm. 

Next,  when  the  terminal  rate  for  one  segment  was  more  than  40  wpm 
away  from  the  band  of  rates  delimited  by  the  other  three  segments, 
the  fourth  rate  would  be  excluded  and  the  remaining  three  rates  would  be 
considered  the  manifest  preference  for  rate.   The  3  retained  rates  would 
have  to  be  within  a  band  of  40  wpm.   This  criterion  was  established  to 
provide  for  the  occurrence  of  a  single  terminal  rate  that  deviated 
greatly  from  the  band  of  the  other  three  terminal  rates. 

Three  criteria  were  defined  for  computing  the  mean  of  a  subject's 
rate  preference: 

- a mean would be calculated only for those subjects with demonstrated
  preference.
- a single rate could be excluded when it deviated greatly from the
  other three.
- at least 3 terminal rates would be used in calculating the mean.
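
The criteria above can be read as a small decision procedure. The following is a minimal sketch of one possible reading, not the procedure actually used in the study: it assumes that "within a band of 40 wpm" means the highest and lowest rates differ by no more than 40 wpm, and that a rate is excluded only when it lies more than 40 wpm outside the band formed by the other three. The function name and the sample rates are hypothetical.

```python
from statistics import mean

BAND_WPM = 40  # band width stated in the criteria


def manifest_preference(final_rates):
    """Return (retained_rates, mean_rate), or None if no preference is shown.

    final_rates: the four terminal rates (wpm) selected by one subject.
    Assumption: a band is the difference between highest and lowest rate.
    """
    rates = list(final_rates)
    if len(rates) != 4:
        raise ValueError("expected four terminal rates")

    # First criterion: all four terminal rates fall within a 40 wpm band.
    if max(rates) - min(rates) <= BAND_WPM:
        return rates, mean(rates)

    # Second criterion: one rate lies more than 40 wpm outside the band
    # of the other three, and those three fall within a 40 wpm band.
    for i, candidate in enumerate(rates):
        others = rates[:i] + rates[i + 1:]
        low, high = min(others), max(others)
        if high - low > BAND_WPM:
            continue
        distance = low - candidate if candidate < low else candidate - high
        if distance > BAND_WPM:
            return others, mean(others)

    return None


# Hypothetical subjects, for illustration only.
print(manifest_preference([200, 210, 220, 230]))  # all four retained
print(manifest_preference([100, 205, 210, 220]))  # 100 wpm excluded
print(manifest_preference([100, 150, 200, 275]))  # no preference shown
```

Under this reading, a subject whose fourth rate falls just outside the band of the other three (for example 170, 200, 210, 220) shows no manifest preference, since the four rates span more than 40 wpm and the outlying rate is not more than 40 wpm away from the others.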


Data 

I have selected the following presentations of data to amplify the
major foci of the study.  A number of other examinations were made and
are reported in detail in the complete study (Levine, 1974).


FIGURE 1
RELATIONSHIP OF DEFINED TERMS

[Figure not reproduced: hypothetical rate-changing behaviors for a
listener.  Rate (expressed in words per minute) is plotted against time
(expressed in seconds, 15 to 120) for Segments 1 through 4, which begin
at 100, 150, 200, and 275 wpm respectively.]

(a) = manipulation onset
(b) = manipulation termination
(c) = manipulation duration (for segment 1)
(d) = manifest preference for rate
(e) = mean manifest preference rate (for three segments)
(f) = difference from preference (for segment 4)
(g) = point of convergence (for all segments)
(h) = extent of movement (for segment 4)

The study substantiated the existence of a manifest preference for
rate of presentation.  All subjects manipulated the presentation rate
over the total listening experience so that the group of final selected
rates was narrower in band width than the band width of the rates used
for the initial presentations of the listening segments.

Manifest  preference  for  rate  was  further  examined  in  relation  to 
convergence  toward  a  common  point.   It  was  found  that  82.3%  of  all  of 
the  individual  listening  segments  were  manipulated  by  the  listener 
toward  a  point  of  convergence.   This  finding  suggests  that  when  a  listener 
is  provided  with  a  listening  experience  and  is  also  provided  the  opportunity 
to  manipulate  that  rate  in  a  self-paced  manner,  the  listener's  manipulation 
will  in  most  cases  be  focused  toward  a  point  of  convergence. 


When the criteria for manifest preference are applied to the data, it
is seen that the percentage of listeners demonstrating a manifest
preference for rate (68.75%) is greater than the percentage of listeners
not demonstrating a manifest preference for rate (31.25%).  The data
further support the hypothesis that a listener, when given autonomous
control over the rate of presentation of recorded information in a
self-paced listening situation, will demonstrate a manifest preference
for rate of presentation.
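
The percentages quoted in this section follow directly from the reported counts. As a hedged check (the variable names are illustrative, and the counts are those given in Tables 1 and 2), the arithmetic can be sketched as:

```python
def percent(count, total):
    """Share of `total` represented by `count`, as a percentage."""
    return 100.0 * count / total


# Segment counts reported in Table 1.
segments = {"toward": 158, "away": 22, "no_change": 12}
total_segments = sum(segments.values())  # 48 subjects x 4 segments
segment_pct = {k: round(percent(v, total_segments), 2)
               for k, v in segments.items()}

# (preference shown, no preference shown) counts reported in Table 2.
subjects = {"All": (33, 15), "Third": (12, 4),
            "Fourth": (10, 6), "Fifth": (11, 5)}
subject_pct = {grade: round(percent(shown, shown + not_shown), 2)
               for grade, (shown, not_shown) in subjects.items()}

print(total_segments, segment_pct)
print(subject_pct)
```

The segment percentages computed this way agree with the reported 82.3%, 11.5%, and 6.25% once rounded to the precision used in the table.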


TABLE 1
CONVERGENCE BEHAVIOR BY NUMBER OF SEGMENTS

                 Movement        Movement
                 Toward          Away From       No
                 Convergence     Convergence     Change

Number of        158             22              12
Segments         (82.3%)         (11.5%)         (6.25%)

Total number of segments = 192 (48 subjects x 4 segments)


TABLE 2
DEMONSTRATED MANIFEST PREFERENCE FOR RATE

                 # of Subjects                   % of Subjects

                 Rate            No Rate         Rate            No Rate
                 Preference      Preference      Preference      Preference
                 Demonstrated    Demonstrated    Demonstrated    Demonstrated

All Grades       33              15              68.75           31.25
Third Grade      12               4              75.0            25.0
Fourth Grade     10               6              62.5            37.5
Fifth Grade      11               5              68.75           31.25


Of the 33 subjects who demonstrated a manifest preference for rate,
a mean manifest preference rate was calculated on the final selected rate
of all four segments for 25 subjects.  Eight subjects had mean manifest
preference rates calculated using three final selected rates.

The mean manifest preference rate of the total group was 207.62 wpm.
The mean manifest preference rate for the fourth grade subjects is
significantly higher than the mean manifest preference rate for the
fifth grade group.  The standard deviation of the total group was 26.87
and the groupings by grade level showed standard deviations of 26.92
for the third grade, 27.66 for the fourth grade, and 18.30 for the fifth
grade.  The high standard deviations support the view that preferred
listening rate is a highly variable attribute and that there is
considerable variance among third, fourth, and fifth grade students
regarding mean manifest preference rate.


An analysis was made of periods within each subject's listening
experience where the rate was maintained without manipulation for a
period of time.  This analysis was made to examine whether the listening
segments were of ample duration to allow the demonstration of manifest
preference for rate.  For each listening segment, the longest period of
non-manipulation was identified in terms of the position of its occurrence.
Over 60% of the subjects had the longest non-manipulated duration at the
end of the listening segment.  This was true for all four listening
segments.


TABLE 3
MEAN MANIFEST PREFERENCE RATE FOR ALL GRADES COMBINED
AND BY INDIVIDUAL GRADE

                 All         Third       Fourth      Fifth
                 Grades      Grade       Grade       Grade

N                33          12          10          11
Mean (wpm)       207.62      206.2125    222.463*    195.658*
S.D.             26.87       26.92       27.66       18.30

*Significantly different at the .05 level (p < .05)


TABLE 4

[Table 4 is illegible in the scanned source.  It appears to have reported,
for each of the four listening segments, the position of the longest
period of non-manipulation.]


Of the 33 subjects who demonstrated a manifest preference for rate,
six different subjects did not manipulate the rate of presentation at
any time during one of the listening segments.  Of the six non-manipulated
segments, five were at the moderately compressed rate (200 wpm), and one
was at the normal rate (150 wpm).  The largest difference between a
subject's mean manifest preference rate and the rate of presentation
of the non-manipulated segments was only 17 wpm.  The average difference
from preference for all six subjects was 8.46 wpm.  This finding indicates
that those segments that were non-manipulated were extremely close to
the listener's manifest preference for rate.
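
As a check on the summary figures, the mean and standard deviation can be recomputed from the six differences listed in Table 5. This is only a sketch; the dictionary layout is illustrative, and the choice between the population and sample forms of the standard deviation is an assumption tested here rather than something the paper states.

```python
from statistics import mean, pstdev, stdev

# Difference from preference (wpm) by subject number, as listed in Table 5.
differences_wpm = {8: 17.0, 48: 3.7, 10: 4.0, 45: 5.0, 22: 6.75, 39: 14.3}
values = list(differences_wpm.values())

mean_difference = mean(values)
sd_population = pstdev(values)  # divides by N
sd_sample = stdev(values)       # divides by N - 1

print(round(mean_difference, 2), round(sd_population, 2), round(sd_sample, 2))
```

The population form (dividing by N) reproduces the reported S.D. of 5.24, which suggests, without confirming, that this form was the one used.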


An examination of the subjects' rate manipulation behavior yielded
no viable correlations.  It was expected that negative correlations would
substantiate that the greater the difference from preference, the sooner
the manipulation onset would occur.  The direction of the correlations
for manipulation termination and manipulation duration was not projected
in the design of the study.  The relationships were tested for significance
at the .05 level of probability.

TABLE 5
DIFFERENCE FROM PREFERENCE
FOR NON-MANIPULATED SEGMENTS

Subject #    Non-Manipulated    Difference from
             Segment            Preference

 8           N (150 wpm)        17    wpm
48           M (200 wpm)         3.7  wpm
10           M (200 wpm)         4    wpm
45           M (200 wpm)         5    wpm
22           M (200 wpm)         6.75 wpm
39           M (200 wpm)        14.3  wpm

Mean (difference from preference) = 8.46 wpm
S.D. = 5.24


Though a series of listening segments was needed to examine
manifest preference for rate, the use of more than a single listening
segment for each subject complicated the analysis of rate manipulation
behavior.  Since each subject in the study yielded four separate sets
of scores, one for each listening segment, it was mandatory that the data
be blocked in four separate groupings to compensate for any statistical
effect that may be caused by pooling all scores of all subjects and thereby
counting a subject's four scores as four different subjects.  It could be
assumed that a relationship would exist between the rate manipulation
behaviors of the four separate segments that a single subject experienced.  As
such, the blocking procedure effectively turned the analysis of rate
manipulation relationships into a series of studies with correlation
coefficients derived for each blocking group.

To  compensate  for  any  effect  the  blocking  may  have  had  on  the 
analysis  of  the  data,  two  different  blocking  procedures  were  used. 
First,  the  scores  of  the  listening  segments  were  blocked  according  to 
the  initial  word  rate  of  the  segment.   All  scores  for  segments  beginning 
with  the  same  initial  word  rate,  regardless  of  their  presentation  order 
within  the  total  listening  experience,  were  analyzed  as  a  group.   Next, 
the  scores  of  the  listening  segments  were  blocked  according  to  the 
presentation  order  of  the  segments.   All  scores  for  segments  in  the  same 
presentation  position,  regardless  of  initial  word  rate,  were  analyzed 
as  a  group.   In  each  of  the  blocking  procedures,  a  single  subject  was 
represented  no  more  than  one  time  in  the  computation  of  the  correlation 
coefficient.   A  total  of  eight  different  correlation  coefficients  were 
computed  for  each  rate  manipulation  behavior  due  to  the  use  of  the  two 
different  blocking  procedures. 
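
The two blocking procedures can be illustrated with a short sketch. The records below are entirely hypothetical (three invented subjects, invented scores) and serve only to show how grouping by initial rate or by presentation order keeps each subject to one score per group before a Pearson correlation is computed; the paper does not say which software or formula was used.

```python
from collections import defaultdict
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def blocked_correlations(records, block_key):
    """Correlate difference from preference with onset within each block."""
    blocks = defaultdict(list)
    for rec in records:
        blocks[rec[block_key]].append(rec)
    result = {}
    for key, rows in blocks.items():
        subjects = [r["subject"] for r in rows]
        if len(subjects) != len(set(subjects)):
            raise ValueError("a subject appears twice in one block")
        result[key] = pearson_r([r["difference"] for r in rows],
                                [r["onset"] for r in rows])
    return result


# Hypothetical data: (subject, order, initial rate, difference, onset).
RAW = [
    ("A", 1, 100, 110, 2.0), ("A", 2, 150, 60, 9.0),
    ("A", 3, 200, 10, 20.0), ("A", 4, 275, 65, 3.0),
    ("B", 1, 275, 90, 2.5), ("B", 2, 200, 15, 12.0),
    ("B", 3, 150, 35, 6.0), ("B", 4, 100, 85, 1.5),
    ("C", 1, 150, 40, 15.0), ("C", 2, 275, 80, 1.0),
    ("C", 3, 100, 95, 2.0), ("C", 4, 200, 5, 30.0),
]
FIELDS = ("subject", "order", "initial_rate", "difference", "onset")
RECORDS = [dict(zip(FIELDS, row)) for row in RAW]

by_rate = blocked_correlations(RECORDS, "initial_rate")   # four groups
by_order = blocked_correlations(RECORDS, "order")         # four groups
print(by_rate)
print(by_order)
```

Each procedure yields four coefficients, which is how two procedures produce the eight coefficients per rate manipulation behavior described above.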

The data indicate that seven of the eight relationships were
negative in direction.  This is the hypothesized direction.  Only two of
the relationships were significant at the .05 level.  Of the two
significant relationships, one was positive in direction and
the other was negative in direction.  When the subjects were blocked
by presentation order of the segments, the strongest relationship occurred
for the first segments, with the relationships getting progressively
smaller for each successive blocking of segments.  Little consistency
was shown between correlation coefficients for the different blocked
groups, which suggests that there is little relationship between manipulation
onset and difference from preference as manifest by the subjects.


TABLE 6
COMPARING MANIPULATION ONSET TO THE
DIFFERENCE FROM PREFERENCE

                              Manipulation Onset
Blocking           Number     (seconds)               Correlation
Group              of Cases   Mean        S.D.        Coefficient

100 wpm Segments   31           2.145       .93       r = -.049
150 wpm Segments   30          14.2       28.16       r = -.2959
200 wpm Segments   27           7.019     10.38       r = -.2419
275 wpm Segments   30           2.45       1.23       r = +.4594*
First Segments     28           5.839     10.33       r = -.382*
Second Segments    30           6.783     15.69       r = -.237
Third Segments     29           5.62      13.88       r = -.217
Fourth Segments    31           7.274     21.02       r = -.191

*Significant at the .05 level (p < .05)


When the relationship between difference from preference and
manipulation termination is examined, no significant correlations are
found.


TABLE 7
COMPARING MANIPULATION TERMINATION TO THE
DIFFERENCE FROM PREFERENCE

                              Manipulation Termination
Blocking           Number     (seconds)               Correlation
Group              of Cases   Mean        S.D.        Coefficient

100 wpm Segments   31         101.935     48.058      r = -.2537
150 wpm Segments   30          91.8       47.41       r = -.145
200 wpm Segments   27          91.46      48.967      r = -.166
275 wpm Segments   30          74.23      50.67       r = -.0929
First Segments     28          88.46      34.48       r = -.205
Second Segments    30         103.27      48.06       r = -.151
Third Segments     29          96.069     53.84       r = -.299
Fourth Segments    31          72.56      53.93       r = -.182


Similarly, no significant correlations were found between the
duration of manipulation and difference from preference.


TABLE 8
COMPARING MANIPULATION DURATION TO THE
DIFFERENCE FROM PREFERENCE

                              Manipulation Duration
Blocking           Number     (seconds)               Correlation
Group              of Cases   Mean        S.D.        Coefficient

100 wpm Segments   31          99.79      47.96       r = -.253
150 wpm Segments   30          77.60      53.84       r = -.027
200 wpm Segments   27          84.47      50.85       r = -.111
275 wpm Segments   30          71.78      50.84       r = -.1037
First Segments     28          82.625     36.645      r = -.085
Second Segments    30          96.48      51.696      r = -.068
Third Segments     29          90.448     56.46       r = -.3388
Fourth Segments    31          65.29      54.47       r = -.1065
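
The paper does not state how significance was tested. A common approach, assumed here purely for illustration, converts each correlation coefficient to a t statistic with n - 2 degrees of freedom and compares it with the two-tailed critical value at the .05 level. The sketch below applies that test to the coefficients and case counts reported in Table 6; the critical values are taken from a standard t table.

```python
from math import sqrt

# Two-tailed critical values of Student's t at the .05 level,
# keyed by degrees of freedom (standard table values).
T_CRIT_05 = {25: 2.060, 26: 2.056, 27: 2.052, 28: 2.048, 29: 2.045}


def t_from_r(r, n):
    """t statistic for a Pearson r computed from n cases (df = n - 2)."""
    return r * sqrt((n - 2) / (1 - r * r))


def significant(r, n):
    """True if r differs from zero at the .05 level (two-tailed, assumed)."""
    return abs(t_from_r(r, n)) > T_CRIT_05[n - 2]


# (number of cases, r) for each blocking group, as reported in Table 6.
TABLE_6 = {
    "100 wpm": (31, -0.049), "150 wpm": (30, -0.2959),
    "200 wpm": (27, -0.2419), "275 wpm": (30, 0.4594),
    "First": (28, -0.382), "Second": (30, -0.237),
    "Third": (29, -0.217), "Fourth": (31, -0.191),
}

flagged = [group for group, (n, r) in TABLE_6.items() if significant(r, n)]
print(flagged)
```

Under these assumptions the test flags the same two onset coefficients that are marked as significant in Table 6, and the largest coefficients for termination (r = -.299, 29 cases) and duration (r = -.3388, 29 cases) fall short of the critical value.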


Conclusions 

This  study  has  suggested  a  disparity  between  the  rate  at  which  a 
student  can  listen  to  recorded  information  and  the  rate  at  which  he 
prefers  to  listen  to  recorded  information.   Further  substantiation  of 
this  disparity  in  future  research  will  assist  in  establishing  the 
parameters  of  the  task  of  training  a  listener  to  utilize  efficient  listening 
behaviors.   Many  attempts  have  been  made  to  train  a  listener  to  comprehend 
at  high  rates  of  presentation.   These  attempts  have  met  with  varying 
degrees  of  success.   With  the  advent  of  inexpensive  speech  compression 
playback  equipment,  many  listeners  will  for  the  first  time  have  the 
opportunity  to  self-pace  the  listening  task.   Their  listening  will  not 
be  guided  by  an  understanding  of  how  fast  they  can  listen,  but  instead 
by  how  fast  they  want  to  listen. 

To create an efficient listening environment for the listener
demands a training procedure that will successfully increase the rate at
which a listener prefers to listen.  The degree of efficiency depends
on the ability of the training procedure to move the preference toward
the point of maximum rate input.  The starting point for the development
of training procedures of this nature is the establishment of the limits
of the training problem.  To evaluate the success of any training
procedure, one must know where the learner is prior to training
so that an assessment of rate preference change can be made.  An appropriate
training procedure is one which decreases the difference between manifest
preference for rate of presentation and potential rate of presentation.
This study has established the presence of manifest preference for rate
and has also suggested the presence of a manifest preference for rate
that is below the potential rate for a listener.  It has also shown that
listening rate is a matter of individual preference and it may not be
practical to attempt to match individual rate preferences with a finite
selection of rates on pre-compressed recordings.




BIBLIOGRAPHY 


Fairbanks, G., N. Guttman, and M. S. Miron, "Effects of Time Compression
upon the Comprehension of Connected Speech," Journal of Speech and
Hearing Disorders, 22:10-19, 1957.

Foulke, E., C. H. Amster, C. Y. Nolan, and R. H. Bixler, "The Comprehension
of Rapid Speech by the Blind," Exceptional Children, 29:134-141, 1962.

Levine, S. J., Listeners' Preferences for Rate of Presentation of
Recorded Information, Research Report, Consortium on Auditory
Learning Materials for the Handicapped, Michigan State University,
East Lansing, Michigan, 1974.

Orr, D. B., H. L. Friedman, and J. C. C. Williams, "Trainability of
Listening Comprehension of Speeded Discourse," Journal of Educational
Psychology, 56:148-56, 1965.

Sticht, T., "Some Relationships of Mental Aptitude, Reading Ability
Using Normal and Time-Compressed Speech," Journal of Communication,
18:243-258.

Wood, C. D., "Comprehension of Compressed Speech by Elementary School
Children," unpublished doctoral dissertation, Indiana University,
1965.




The  Effect  of  Time  Compressing  an  Audiovisual  Instructional 
Program  upon  Learning  and  Retention 
by  Koskey,    B.  E. 




THE EFFECT OF TIME-COMPRESSING AN AUDIOVISUAL INSTRUCTIONAL
PROGRAM UPON LEARNING AND RETENTION¹

B. Eugene Koskey²


Abstract 

The purposes of this study were (1) to identify an audiovisual
instructional program which is visually valid, i.e., one which contains
visuals significant to the content, and (2) to time-compress
that program to determine what effect compression has upon comprehension
and retention, and whether or not grade point average (gpa)
has any effect upon these functions.  Data was gathered from two
experiments, the first of which determined the visual validity of
the stimulus material.  The second experiment determined the results
of compressing the material to 65% of the original time (oot) and
to 50% oot.  Results showed no significant differences between treatment
groups, gpa groups, or in the interaction between them.  Findings
suggest that it is possible to compress an audiovisual instructional
program as much as 50% oot without significant loss in comprehension
or retention.


¹This study was the author's doctoral dissertation for the Indiana
University School of Education.

²Dr. Koskey is Director of Instructional Development for the University
of Wisconsin Center System, Madison, Wisconsin.





The population explosion and the knowledge explosion have caused
educators to take a new, more critical view of the processes of education
which have been in practice for decades.  Both of the above-mentioned
factors suggest that in order to satisfy current and future
demands, educational practices must become more effective and more
efficient.  Whatever strides may be made in effecting learning in a
shorter time segment than originally conceived would, by definition,
contribute to the overall efficiency of our educational system.

One  instructional  technique  which  has  received  increasing 
interest  in  the  past  few  years  is  that  of  the  time  compression  of 
speech,  more  commonly  referred  to  as  compressed  speech.  Recordings 
of  compressed  speech  permit  one  to  listen  to  material  with  reasonable 
comprehension  in  less  time  than  it  took  to  record  it  originally.  For 
example,  a  recorded  passage  which  would  normally  take  ten  minutes  to 
listen  to  may  be  heard  in,  say,  five  to  seven  minutes. 
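
The arithmetic behind such figures is simple. As a rough sketch, assuming compression is expressed as a percentage of the original time (the "oot" convention used in the abstract) and taking a hypothetical original speaking rate of 150 wpm, the playback time and resulting word rate can be computed as follows; the function names are illustrative.

```python
def compressed_minutes(original_minutes, percent_of_original_time):
    """Playback time after compression to a given percentage of original time."""
    return original_minutes * percent_of_original_time / 100.0


def compressed_wpm(original_wpm, percent_of_original_time):
    """Word rate heard by the listener after compression."""
    return original_wpm * 100.0 / percent_of_original_time


# A ten-minute passage at 70%, 65%, and 50% of the original time.
for percent in (70, 65, 50):
    print(percent, compressed_minutes(10, percent), compressed_wpm(150, percent))
```

Compressing to 50% of the original time halves the listening time and doubles the word rate, so the hypothetical 150 wpm recording would be heard at 300 wpm.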

There are basically two methods used to achieve time compression
of a recording (Resta, 1971).  One is the speed changing method in
which one plays back a recording at a higher than normal rate of
speed.  This method also raises the pitch of the voice and thus produces
a "Chipmunk" sound.  The other method is that of sampling,
wherein portions of the recording are systematically eliminated in
order to reduce the playback time.  The sampling method retains the
original pitch of the voice and is the method with which the present
study was concerned.
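
The sampling method can be illustrated with a deliberately simplified sketch. It treats a recording as a list of samples, keeps a fixed run of samples, discards the next run, and repeats, so the retained portions play at their original speed and pitch. The function name and the interval lengths are hypothetical; the sketch does not attempt to model any particular compressor, and a practical device would likely also need to smooth the joins between retained portions.

```python
def sample_compress(samples, keep, discard):
    """Keep `keep` samples, drop the next `discard` samples, and repeat."""
    if keep <= 0 or discard < 0:
        raise ValueError("keep must be positive and discard non-negative")
    period = keep + discard
    retained = []
    for start in range(0, len(samples), period):
        # Retained portions are abutted in their original order.
        retained.extend(samples[start:start + keep])
    return retained


recording = list(range(10))  # stand-in for ten audio samples
compressed = sample_compress(recording, keep=3, discard=2)
print(compressed)
print(len(compressed) / len(recording))  # fraction of the original time
```

With three samples kept out of every five, the compressed recording occupies 60% of the original time; the keep-to-discard ratio sets the amount of compression.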

Compressed speech is relatively new to the educational scene
since by its very nature it could not exist until the invention of
recorded sound.  This time compression technique was previously
considered to be a breakthrough for the blind, who rely quite heavily
upon sound recordings for information, since their listening rate
could thereby be made more nearly similar to the rate at which sighted
persons read.  However, the efficiency afforded blind learners has
come to be considered important for sighted students as well, since
they are making greater use of various media, especially
the audio medium.

The  present  study  dealt  with  learning  from  a  sound  filmstrip. 
The  study  grew  out  of  an  interest  in  determining  some  conditions  under 
which  time-compressed  sound  filmstrips  and  other  similar  audiovisual 
presentations  could  be  utilized  in  increasing  learning  efficiency 
without  substantially  sacrificing  comprehension  and  retention. 

Related Research

Although Cramer (1971) reported that a study involving the time
compression of speech was conducted as early as 1929, and a few
studies followed during succeeding years, it was not until the early
1960's that a significant increase in interest in compressed speech
took place.  This interest was partly caused by the introduction of
two basic hardware designs which facilitated the sampling method of
compressing audio recordings.  One such unit was the result of
experiments by Fairbanks and his associates at the University of
Illinois.  In 1959 they applied for patents on a device which
automatically discarded portions of the recorded message, thus
decreasing the time it took to play it back.  This unit has been
referred to as the Fairbanks compressor.

In 1961 and 1962, Anton Springer filed for a series of patents on
improvements of rotating heads and drive assemblies which were later
incorporated into the Eltro Tempo Regulator, a compression device
manufactured in Germany (Cramer, 1971).  These two devices, the Fairbanks
compressor and the Eltro Information Rate Changer, as the Tempo
Regulator was later called, have been the primary tools used in the
compression of audio recordings to date.

Since the early sixties, a considerable number of studies has
been conducted regarding the relationship between comprehension and/or
intelligibility and rate of presentation as well as relationships between
a number of other dependent and independent variables associated
with compressed speech (Foulke, 1967, 1971).  The efficiency of this
technique has been fairly well established in several studies that
have demonstrated "...that much learning can occur with materials that
have been compressed by 30-40% [Sticht, 1971, p. 89]."  Also, Woodcock
(1971) has reported that "...information learned through the medium of
controlled-rate recordings is retained and forgotten in much the same
way as has been observed for learning obtained through other modes
[p. 93]."  However, most of the research reported has involved audio
material exclusively.  Only a few studies have incorporated visual
material as well.

Some of these studies have allowed the subject (S) to read material
while he was listening to it in compressed form.  Travers and
Jester (Travers, 1964) found that reading along with compressed
speech at speeds of 200 words per minute (wpm) or less provided no
advantage over only hearing the material.  However, at speeds greater
than 200 wpm "...the audiovisual transmissions of information turned
out to be superior to the single channel [p. 376]."  Travers concluded
that "...no advantage is achieved by transmitting redundant information
simultaneously, through both the auditory and the visual modality,
except when unusually high speeds of transmission are involved
[p. 378]."¹

¹Italics supplied by the present author.

Reid (1971) also studied the effects of reading along while
listening to compressed speech.  She found that on a vocabulary test
second grade children scored significantly higher following dual
modality practice on selected materials than following single modality
(listening) practice.  The highest rate of compression used was 225 wpm.

Other studies have dealt with visually augmenting the spoken passage
with pictorial information.  Loper (1967) studied the effect on
comprehension and retention of presenting visually-augmented compressed
speech passages via television.  The difference in comprehension between
aural-only and aural-visual groups at the same rate of compression
was not significant.  However, he found that the aural-visual
group performed significantly better than the aural-only group in
retention.  He concluded that the visual augmentation of compressed
speech passages did not aid comprehension, but aided retention at
high rates of compression.

Woodcock and Clark (1968) found that mental retardates scored
significantly higher, both when listening to compressed audio passages
alone and when listening to them with correlated slides, than when
they read the material.  The investigators also reported that listening
with correlated slides produced significantly higher scores than
listening alone.  Listening presentations ranged from 53 to 378 wpm.

Four studies were conducted in which the visual part of the
audiovisual presentation was programed as an integral part of the
message rather than as an adjunct to the aural recording.  In one of
these studies, Anderton (1970) attempted to determine the effectiveness
of a tape-slide instructional program presented at compressed speeds
of up to 300 wpm.  He did not find any significant differences in
comprehension as a function either of rate of presentation or of scholastic
ability as measured by grade point average (gpa).  However, the
group receiving the material at 300 wpm received it twice, making the
rate of 250 wpm the highest compression rate for a single exposure.

In the second of these studies, using U. S. Air Force personnel
as Ss, Eckhardt (1970) investigated the effect of using a multi-media
programed instruction unit on driver education utilizing slides and
motion pictures in conjunction with a compressed audio track.  For
this study the visuals were edited by omitting various slides and
selectively editing motion picture scenes.  Classifying Ss with respect
to aptitude, as determined by their scores on the Armed Forces Qualification
Test (AFQT), and with presentation rate as the independent
variable, Eckhardt found a significant interaction between presentation
rate and aptitude.  The nature of the interaction was that high aptitude
Ss (AFQT scores of 65 to 92) were able to comprehend the material
at 275 wpm without significant loss, whereas the "...comprehension
loss for low aptitude Ss [AFQT scores of 10 to 30] at the faster rates
approached a nonacceptable level [p. 85]."

Thirdly,  Gleason,  Calloway,  and  Lakota  (1971)  conducted  a  study 
involving  the  time  compression  of  the  audio  portion  of  a  tape-filmstrip 
program  on  writing  behavioral  objectives.  The  program  depended  heavi- 
ly upon  the  visuals,  although  they  were  primarily  verbal  in  nature. 
The  program  was  constructed  to  elicit  written  responses  on  the  parts 
of the Ss throughout its presentation. The difference in comprehension
between  the  normal  (approximately  175  wpm)  and  accelerated  (approxi- 
mately 225  wpm)  rates  was  not  significant. 

Finally,  Parker  (1971)  studied  the  effect  of  amount  of  compression 
of  a  sound  motion  picture  upon  information  recall.  This  was  accomp- 
lished by  deleting  every   sixth  frame  of  picture  and  sound  track  for 
one  experimental  group,  and  every  eighth  and  tenth  frame  for  each  of 
two  other  experimental  groups,  respectively.  He  found  no  significant 
differences  among  the  means  of  the  control  group  and  two  experimental 
groups  receiving  the  stimulus  material  at  the  two  fastest  rates  of 
compression (12.5% and 16.67%).
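The compression percentages in Parker's design follow from the deletion interval: removing every nth frame shortens the running time by 1/n. The short Python sketch below works through that arithmetic; the function name is illustrative only and does not come from Parker's report.

```python
def compression_from_frame_deletion(n: int) -> float:
    """Percentage of running time removed when every nth frame is deleted."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return 100.0 / n


# Deletion intervals described for the three experimental groups.
for n in (6, 8, 10):
    pct = compression_from_frame_deletion(n)
    print(f"every {n}th frame deleted: {pct:.2f}% compression")
```

On this reckoning the two fastest presentations are those with every sixth and every eighth frame deleted.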

In  summary,  although  a  substantial  number  of  studies  dealing  with 
compressed  speech  has  been  conducted  in  recent  years,  most  of  these 
have been concerned with the audio medium alone and very few with audio
materials accompanied by visuals. Of those reported, only four dealt
with materials wherein the visuals were programed as an integral part
of the presentation. However, none of the four was concerned explicitly
with whether or not the visual portion of the stimulus materials
actually  contributed  to  their  content. 

A  number  of  studies  has  established  that  compressed  audio  mater- 
ials can  be  used  without  significant  losses  in  comprehension.  However, 
due  to  the  small  number  of  studies  involving  compressed  audiovisual 
materials,  additional  research  seems  necessary  before  such  a  statement 
can  be  made  for  these  programs. 

Purpose  of  the  Study 
At  the  Second  Louisville  Conference  on  Rate  and/or  Frequency- 
Controlled  Speech  held  in  1969,  Boyle  (1971)  reported  the  use  of  time- 
compressed  tape-slide  presentations  at  the  University  of  Missouri 
School  of  Medicine.  These  materials  are  being  made  available  to 
medical students at the normal rate and at "...70% compression...
[p. 328]." They are reportedly comprised of visuals having an
essential  role  in  achieving  the  objectives  of  the  program.  However, 
this  has  not  yet  been  substantiated  through  empirical  study.  In  fact, 
Boyle  also  reported  to  the  present  writer  that  no  studies  of  the 
relationship  between  compression  and  comprehension  of  these  materials 
have  been  conducted,  although  the  production  of  additional  materials 
in  this  format  is  an  on-going  process  at  the  School  of  Medicine.  This 
situation  may  well  exist  at  other  institutions. 

2 Although Boyle does not indicate precisely what this means, it is
presumed that the material is presented at 70% of the original time (oot).




Since very few studies have been conducted in the area of time-
compressed multi-media instructional materials, and since as far as
the  present  writer  can  determine  no  studies  have  reported  any  valida- 
tion of  the  visual  aspect  of  the  materials  used,  the  question  arose 
as  to  what  effect  the  compression  of  multi-media  instructional  materi- 
als which  contain  significant  information  in  the  visuals  has  upon 
comprehension  and  retention. 

Therefore,  the  primary  purposes  of  the  present  study  were  (1)  to 
identify  a  multi-media  instructional  program,  viz.,  a  sound  filmstrip, 
which is visually valid, i.e., one in which the visual aspect is an
essential  part  of  the  content;  and  (2)  to  compress  the  sound  track  of 
that  program  and  determine  what  effect  the  total  program,  with  visuals 
synchronized  with  the  accelerated  audio,  has  upon  comprehension  and 
retention.  Secondary  purposes  were  to  determine  (1)  what  relationship 
scholastic  ability,  as  measured  by  gpa,  has  with  the  comprehension  and 
retention of compressed audiovisual materials which are visually valid;
and  (2)  whether  the  effect  of  differential  compression  is  the  same  for 
all  gpa  levels. 

Experiment  I 
Method 

Subjects.  Fifty-two  undergraduate  senior  elementary  education 
students  in  the  College  of  Education  at  the  University  of  New  Mexico 
volunteered  to  participate  in  the  study. 

Materials.  A  review  of  a  number  of  sound  filmstrips  produced  for 
the college level was made to determine possible prospects for the
visual validation process. From this group the sound filmstrip entitled
Introduction and General Principles of Close-Up Photography and
Copying, from a series produced by Bailey Films entitled Photography:
Close-Ups and Copying with 35mm Cameras, was selected as being
potentially visually valid. The 35mm filmstrip was accompanied by a 33 1/3
revolutions  per  minute  (rpm)  monaural  disc  recording  which  contained 
the  narration  and  filmstrip-advance  tones.  A  test  instrument  composed 
of  20  multiple-choice  questions  was  devised  by  the  investigator  to 
test  for  comprehension  of  the  material. 

Apparatus.  The  filmstrip  was  shown  on  a  Graflex  SVE  Model  100 
filmstrip  projector.  A  six-foot  wide  image  was  projected  on  an  eight- 
foot  white  beaded  screen  with  Ss  seated  from  16  to  30  feet  from  it. 
Filmstrip  frames  were  advanced  manually  by  the  investigator  as  the 
frame-advance  tones  were  sounded.  The  record  was  played  on  a  Newcomb 
Model  R-124  monaural  record  player. 

Procedure.  Ss  were  randomly  assigned  to  a  control  group  which 
received  both  sound  and  picture  as  was  originally  intended  by  the 
producer,  and  to  an  experimental  group  which  received  the  sound  track 
only.  The  stimulus  material  was  presented  at  the  normal  speed.  Each 
group  of  26  Ss  was  given  no  specific  instructions  before  the  presenta- 
tion other  than  being  told  verbally  that  a  test  on  the  content  would 
be  given  immediately  afterward.  The  test  instrument  was  administered 
to  each  group  immediately  after  the  presentation.  A  t-test  was  used 
to  determine  whether  or  not  a  statistically  significant  difference  in 
mean  scores  existed  between  the  two  groups.  A  split-half  method  using 
the Spearman-Brown prophecy formula (Garrett, 1962) was used to test
the reliability of the test instrument.
Results  and  Discussion 

The  t-test  showed  that  the  mean  number  of  correct  responses  of 
the control group was significantly higher than that of the experimental
group (p < .001). The results of the reliability test of the
test  instrument  showed  the  reliability  coefficient  to  be  .69  for  the 
combined  audio  and  visual  portions.  Squaring  this  quantity  reveals 
that  parallel  forms  of  this  test  instrument  would  have  just  under  50% 
of  their  variance  in  common.  This  shows  the  test  instrument  to  be 
only  moderately  reliable,  and  hence  the  difference  in  mean  number  of 
correct  responses  might  have  been  even  greater  if  the  reliability  of 
the  test  were  higher.  Therefore,  it  was  concluded  that  the  visual 
portion  of  the  presentation  did  contribute  to  the  total  content,  and 
hence  the  filmstrip  was  considered  to  be  visually  valid. 
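The reliability figures reported above can be retraced with the Spearman-Brown prophecy formula, r_full = 2r_half / (1 + r_half). The Python sketch below is illustrative only: the correlation between the two half-tests is not reported in the paper, so it is backed out from the reported coefficient of .69, and the function names are assumptions.

```python
def spearman_brown(r_half: float, k: float = 2.0) -> float:
    """Predicted reliability of a test lengthened k times (k = 2 for split-half)."""
    return k * r_half / (1.0 + (k - 1.0) * r_half)


def implied_half_correlation(r_full: float) -> float:
    """Invert the k = 2 formula: the half-test correlation implied by r_full."""
    return r_full / (2.0 - r_full)


r_full = 0.69                                # reliability reported in the paper
r_half = implied_half_correlation(r_full)    # implied value, not reported
shared_variance = r_full ** 2                # variance shared by parallel forms

print(f"implied half-test correlation: {r_half:.3f}")
print(f"stepped-up reliability:        {spearman_brown(r_half):.2f}")
print(f"shared variance:               {shared_variance:.1%}")
```

Squaring .69 gives about .476, which matches the paper's statement that parallel forms would share just under 50% of their variance.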

Experiment  II 
Method 

Subjects.  Ss  were  45  volunteers  from  three  education  classes 
comprised  of  undergraduate  (senior)  and  graduate  students  at  the 
University  of  New  Mexico.  None  of  these  Ss  served  in  Experiment  I. 

Materials.  The  sound  track  of  the  stimulus  material  used  in 
Experiment  I  was  transferred  from  disc  to  tape  to  facilitate  handling. 
It  was  then  time-compressed  to  two  compression  rates:  65%  oot  and  50% 
oot.  The  playback  speed  of  the  two  compression  rates  as  well  as  that 
of  the  normal  speed  copy  was  7  1/2  inches  per  second  (ips).  A  two- 
minute passage from a compressed speech sample disc recording was used
as an orientation for the groups which experienced the stimulus material
in  compressed  form.  The  disc,  entitled  Talking  Book  No.  APH  70814,  was 
produced  by  the  U.  S.  Government  through  the  Library  of  Congress  and 
contained  excerpts  from  a  1966  issue  of  Time  Magazine.  The  rate  of 
the  sample  was  approximately  275  wpm  and  the  playback  speed  of  the  disc 
was  16  2/3  rpm. 

Apparatus.  Time-compression  of  the  audio  track  was  accomplished 
using  the  Graham  compressor,  a  modification  of  the  Fairbanks  compressor, 
at the Center For Rate Controlled Recordings at the University of
Louisville.  The  filmstrip  was  shown  on  a  Graflex  SVE  Model  100  film- 
strip  projector.  A  four-foot  wide  image  was  projected  on  a  white 
plastered  wall  approximately  16  feet  from  the  Ss.  Filmstrip  frames 
were  advanced  manually  by  the  investigator  as  the  frame-advance  tones 
were  sounded  on  the  tape.  A  Sony  Model  105  monaural  tape  recorder  was 
used  to  play  the  sound  track  versions  of  the  stimulus  material,  whereas 
a  Newcomb  Model  R-124  monaural  record  player  was  used  for  the  orien- 
tation disc  playback. 

Procedure.  Ss  were  assigned  to  three  groups  randomly.  The  stimu- 
lus material  was  presented  to  a  control  group  at  the  normal  rate,  to 
one  experimental  group  at  65%  oot,  and  to  another  experimental  group 
at  50%  oot.  Sessions  were  conducted  in  which  four  to  six  Ss  assigned 
to  a  group  received  the  presentation  at  the  same  time.  There  were  15 
Ss  in  each  group.  Before  each  presentation,  a  prepared  statement  of 
instructions  to  the  Ss  was  read  by  the  investigator  (See  Appendixes 
A  and  B).  The  experimental  groups  also  received  the  compressed  orien- 
tation sample  before  the  stimulus  material  was  presented. 




Each S completed the same multiple-choice test administered in
Experiment I immediately after the presentation. An identical test
was administered to each S two weeks after the initial testing. To
guard against contaminating the results of testing by including Ss who
happened to be familiar with the content, after the presentation and
before being tested, each S was asked to place an x at the top of his
paper if he felt he knew the subject matter before being presented the
stimulus material. However, those few (two Ss in each experiment) who
so indicated did not score exceptionally high; their scores fell
at  or  slightly  above  the  mean  in  each  of  their  respective  groups. 
Therefore,  the  reported  previous  knowledge  was  ignored,  and  the  scores 
of  these  Ss  were  included  in  the  final  tabulation. 

Ss  were  asked  to  state  their  gpas  and  were  divided  into  three 
levels  on  this  basis:  low,  i.e.,  below  3.0  (B);  medium,  i.e.,  3.0  - 
3.49;  and  high,  i.e.,  3.5  and  above.  The  three  levels  of  gpa  and  the 
three treatments were combined in a 3x3 factorial design (Edwards, 1946).
Two Ss were deliberately assigned incorrectly with respect to gpa
level  to  achieve  equal  cell  frequency.  These  were  chosen  on  the  basis 
of  being  the  closest  S  in  their  respective  levels  to  the  level  to 
which  the  reassignment  had  to  be  made.   In  the  control  group  a  reported 
gpa  of  2.9  (falling  into  the  low  level)  was  moved  to  medium  (3.0  - 
3.49), and in group X c,q (experimental group receiving the presentation
in    cot)  a  reported  gpa  of  3.5  (falling  into  the  high  level)  was 
reassigned  to  medium. 

Although Carroll (1967) has suggested that the unit of measurement
of compressed speech commonly used, viz., wpm, can be rather imprecise
and would better be replaced by syllables per minute, or even phonemes
per minute, wpm has nevertheless prevailed to date. However, when
reporting rate measurements for audiovisual material, wpm can describe
only  a  part  of  the  total  presentation.  The  standard  method  of  arriv- 
ing at  wpm  is  to  count  the  number  of  words  in  a  passage  and  divide 
that  number  by  the  number  of  minutes  of  total  playback  time.  However, 
when  visuals  are  added  to  a  presentation,  not  only  is  there  a  frames 
per  minute  (fpm)  count  which  may  be  considered,  but  also  there  usually 
are  pauses  within  the  total  program  for  frame  changes  and  silent  frames 
which  require  no  narration,  both  of  which  are  not  present  in  audio-only 
materials.  In  calculating  the  normal  wpm  rate  of  the  sound  filmstrip 
used  in  this  experiment  by  the  usual  method,  a  measurement  of  138  wpm 
resulted.  However,  by  subtracting  the  time  taken  by  pauses  between 
frames  and  during  silent  frames  from  the  total  playback  time,  the  re- 
sulting rate  was  165  wpm.  Considering  the  unique  characteristics  of 
audiovisual  presentations,  it  was  decided  to  use  a  different  designa- 
tion, audiovisual  measurement  (avm)  for  reporting  rates  in  this  study. 
This  term  includes  the  adjusted  wpm  combined  with  fpm.  Therefore,  the 
control  group  of  this  experiment  received  the  material  at  165/6  avm, 
viz., 165 wpm and 6 fpm. Accordingly, X.65 received the material at
253/9 avm, and X.50, at 330/12 avm.
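The rate bookkeeping described above can be sketched in a few lines of Python. The word count and timings in the first part are hypothetical, chosen only to land near the reported 138 and 165 wpm, since the paper gives neither the word count nor the running time. The avm figures are derived from the reported normal rate of 165 wpm and 6 fpm. The reported 253 wpm suggests that fractional values were truncated rather than rounded, so the sketch truncates; that too is an assumption.

```python
def words_per_minute(words: int, total_min: float, pause_min: float = 0.0) -> float:
    """Word rate; pass pause_min to exclude frame-change pauses and silent frames."""
    speaking_min = total_min - pause_min
    if speaking_min <= 0:
        raise ValueError("speaking time must be positive")
    return words / speaking_min


def avm(wpm: float, fpm: float, fraction_oot: float) -> str:
    """Audiovisual measurement 'wpm/fpm' at a given fraction of original time."""
    if not 0 < fraction_oot <= 1:
        raise ValueError("fraction_oot must be in (0, 1]")
    # Truncation (not rounding) reproduces the figures printed in the paper.
    return f"{int(wpm / fraction_oot)}/{int(fpm / fraction_oot)}"


# Hypothetical program: 1650 words, 12 minutes in all, 2 minutes of pauses.
print(words_per_minute(1650, 12.0))        # usual method, pauses included
print(words_per_minute(1650, 12.0, 2.0))   # adjusted, pauses excluded

# Reported normal rate (165 wpm, 6 fpm) at the three presentation speeds.
for fraction in (1.0, 0.65, 0.50):
    print(f"{fraction:.0%} oot: {avm(165, 6, fraction)} avm")
```

Dividing by the fraction of original time is what makes the 50% oot condition exactly double both the word rate and the frame rate.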
Results  and  Discussion 

The  results  of  the  analysis  of  variance  showed  that  neither  of 
the  two  main  effects  nor  the  interaction  between  them  was  statistically 
significant.  This  was  true  for  the  initial  test  as  well  as  for  the 
difference  between  the  two  tests,  i.e.,  the  amount  of  forgetting. 




Tables  1  and  2  show  these  results. 

Mean scores on the initial testing for the treatment effect were
12.6 for the control group, 12.9 for group X.65, and 12.2 for group
X.50. On the test taken two weeks later, the control group mean was
11.0, that for group X.65 was 11.7, and that for group X.50 was 11.1.
Table  3  shows  these  results. 

Since  in  both  testings  mean  scores  varied  less  than  one  response 
among  the  three  treatments,  and  since  this  variation  was  not  statis- 
tically significant  in  either  testing,  it  was  concluded  that  the 
present  study  showed  no  evidence  that  any  loss  in  comprehension  or 
retention results when visually valid audiovisual instructional materials
are compressed to 50% oot. Of course, in order to generalize
beyond  the  results  of  this  study,  additional  research  using  other 
stimulus  materials  would  be  required. 

A  t-test  for  paired  measures  (Edwards,  1946)  was  used  to  determine 
whether  or  not  a  statistically  significant  amount  of  forgetting  had 
occurred  between  the  two  testings.  The  results  showed  the  retention 
test  scores  to  be  significantly  lower  than  those  of  the  initial  test 
(p < .01), indicating that forgetting had in fact taken place to a
significant degree. The differences in mean scores between the two
testings ranged from 1.6 responses for the control group down to 1.1 for
group X.50. This places the forgetting percentage for the control
group at 13% and for the two experimental groups at 9%. Although the
retention  percentages  were  slightly  better  for  the  compressed  material 
than  for  that  presented  at  the  normal  rate,  the  difference  was  not 
significant  as  previously  mentioned.  Therefore,  the  results  agree  with 




TABLE 1

Results of Analysis of Variance of Number
of Correct Responses Immediately
After Presentation

SOURCE OF VARIATION        SS     df      MS      F

Treatment                 3.38     2     1.69   0.13
GPA                      28.32     2    14.16   1.12
Interaction              19.81     4     4.95   0.39
Within Groups           455.60    36    12.66


TABLE 2

Results of Analysis of Variance of Difference
Between Number of Correct Responses on
First and Second Testings

SOURCE OF VARIATION        SS     df      MS      F

Treatment                 2.31     2     1.16   0.25
GPA                       9.91     2     4.96   1.06
Interaction               7.02     4     1.76   0.38
Within Groups           168.00    36     4.67
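
The mean squares and F ratios in Tables 1 and 2 can be checked from the printed sums of squares and degrees of freedom, since MS = SS / df and F = MS(effect) / MS(within groups). The Python sketch below does only that arithmetic; the function name is illustrative, and no raw scores are assumed.

```python
def anova_summary(effects, within):
    """Return (MS_within, {effect: (MS, F)}) from sums of squares and df."""
    ss_within, df_within = within
    ms_within = ss_within / df_within
    summary = {}
    for name, (ss, df) in effects.items():
        ms = ss / df                           # mean square for the effect
        summary[name] = (ms, ms / ms_within)   # F = MS_effect / MS_within
    return ms_within, summary


# SS and df as printed in Tables 1 and 2.
table1 = {"Treatment": (3.38, 2), "GPA": (28.32, 2), "Interaction": (19.81, 4)}
table2 = {"Treatment": (2.31, 2), "GPA": (9.91, 2), "Interaction": (7.02, 4)}

ms_w1, rows1 = anova_summary(table1, within=(455.60, 36))
ms_w2, rows2 = anova_summary(table2, within=(168.00, 36))

for label, ms_w, rows in (("Table 1", ms_w1, rows1), ("Table 2", ms_w2, rows2)):
    print(f"{label}: MS within = {ms_w:.2f}")
    for name, (ms, f) in rows.items():
        print(f"  {name:<12} MS = {ms:.2f}   F = {f:.2f}")
```

Every F ratio falls below 1.2, well short of the .05 critical values of roughly 3.26 for 2 and 36 df and 2.63 for 4 and 36 df found in standard F tables.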

TABLE 3

[Mean scores on the initial and retention tests. The table body is not
recoverable from the source.]


Woodcock's  statement  which  characterized  the  nature  of  learning  and 
forgetting  of  time-compressed  material  as  much  the  same  as  that  of 
material  at  normal  rates  and  in  other  formats  (Woodcock,  1971). 
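
The forgetting percentages quoted in this discussion can be recomputed from the reported group means. The Python sketch below uses only the means given in the text; the dictionary keys describe the three presentation conditions and are labels chosen for illustration.

```python
# (initial test mean, two-week retention test mean), as reported in the text
means = {
    "control (normal rate)": (12.6, 11.0),
    "65% oot": (12.9, 11.7),
    "50% oot": (12.2, 11.1),
}


def forgetting(initial: float, retention: float) -> tuple:
    """Return (mean responses lost, loss as a percentage of the initial mean)."""
    loss = initial - retention
    return loss, 100.0 * loss / initial


for group, (initial, retention) in means.items():
    loss, pct = forgetting(initial, retention)
    print(f"{group:<22} lost {loss:.1f} responses ({pct:.0f}%)")
```

The percentages are taken relative to each group's own initial mean, which is why both compressed conditions round to 9% although their raw losses differ slightly.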

Tables  1  and  2  also  show  that  the  differences  in  mean  scores  for 
the  gpa  effect  as  well  as  for  the  interaction  of  gpa  and  treatment  were 
non-significant.  Composite  gpa  mean  scores  ranged  from  11.5  for  the 
low level to 13.5 for the high level on the initial testing. On the
retention  test  the  mean  for  the  low  level  was  10.9,  for  medium,  11.3, 
and  for  high,  11.7.  Although  one  may  observe  a  steady  rate  of  rise 
from  the  low  level  to  the  high  level  in  both  cases,  the  increase  was 
not  statistically  significant  and  therefore  supports  Anderton's  (1970) 
findings  pertaining  to  this  variable. 

In  total,  the  results  of  this  experiment  tend  to  support  the  work 
of  other  researchers  who  have  dealt  with  compressing  multi-media 
instructional  programs,  i.e.,  that  audiovisual  programs  can  be  compres- 
sed without  significant  losses  in  comprehension  and  retention,  and  that 
various  levels  of  scholastic  ability  as  measured  by  gpa  have  no  signi- 
ficant relationship  to  the  ability  to  comprehend  and  retain  information 
presented  in  this  manner. 




References 


American Psychological Association, Council of Editors. Publication
manual of the American Psychological Association. (Rev. ed.)
Washington, D.C.: APA, 1967.

Anderton, R. L. A study of the effect of a time-compressed tape-slide
instructional program upon the learner. (Doctoral dissertation,
University of Colorado) Ann Arbor, Mich.: University Microfilms,
1970. No. 70-16,461.

Boyle, G. J. Compressed speech in medical education. In E. Foulke
(Ed.), Proceedings of the Second Louisville Conference on Rate
and/or Frequency-Controlled Speech. Louisville, Ky.: University of
Louisville, 1971.

Carroll, J. B. Problems of measuring speech rate. In E. Foulke (Ed.),
Proceedings of the Louisville Conference on Time Compressed Speech.
Louisville, Ky.: University of Louisville, 1967.

Cramer, H. L. An introduction to speech time compression techniques:
The early development of speech time compression concept and
technology. In E. Foulke (Ed.), Proceedings of the Second Louisville
Conference on Rate and/or Frequency-Controlled Speech. Louisville,
Ky.: University of Louisville, 1971.

Eckhardt, W. W., Jr. Learning in multi-media programed instruction as
a function of aptitude and instruction rate controlled by compressed
speech. (Doctoral dissertation, University of Southern California)
Ann Arbor, Mich.: University Microfilms, 1970. No. 70-23,155.

Edwards, A. L. Statistical Analysis for Students in Psychology and
Education. New York: Rinehart, 1946.

Foulke, E. (Ed.) Proceedings of the Louisville Conference on Time
Compressed Speech. Louisville, Ky.: University of Louisville, 1967.

Foulke, E. (Ed.) Proceedings of the Second Louisville Conference on
Rate and/or Frequency-Controlled Speech. Louisville, Ky.: University
of Louisville, 1971.

Garrett, H. E. Elementary Statistics. (2nd ed.) New York: David McKay,
1962.

Gleason, G., Calloway, R., & Lakota, R. Effects of audio rate
compression on student comprehension and attitudes. Unpublished manuscript,
The University of Wisconsin-Milwaukee, 1971.




Loper, J. L. An experimental study of some effects of time compression
upon the comprehension and retention of a visually augmented
televised speech. (Doctoral dissertation, University of Southern
California) Ann Arbor, Mich.: University Microfilms, 1967. No. 67-6,505.

Meyer, W. J. Inferential Statistics. In Van Dalen, D. B. Understanding
Educational Research. New York: McGraw-Hill, 1962.

Parker, P. J. The effect of varying degrees of compression in a 16mm
sound motion picture upon information recall. (Doctoral dissertation,
Indiana University) Ann Arbor, Mich.: University Microfilms, 1971.
No. 72-1,524.

Reid, S. L. The effect on reading achievement of reading paced by
compressed speech. (Doctoral dissertation, Indiana University) Ann
Arbor, Mich.: University Microfilms, 1971. No. 71-24,562.

Resta, P. E. The effects of training on the intelligibility and
comprehension of frequency-shifted time-compressed speech by the blind.
In E. Foulke (Ed.), Proceedings of the Second Louisville Conference
on Rate and/or Frequency-Controlled Speech. Louisville, Ky.:
University of Louisville, 1971.

Sticht, T. G. Studies on the efficiency of learning by listening to
time-compressed speech. In E. Foulke (Ed.), Proceedings of the Second
Louisville Conference on Rate and/or Frequency-Controlled Speech.
Louisville, Ky.: University of Louisville, 1971.

Travers, R. M. W. The transmission of information to human receivers.
AV Communication Review, 1964, 12(4), 373-385.

Woodcock, R. W. The application of rate-controlled recordings in the
classroom. In E. Foulke (Ed.), Proceedings of the Second Louisville
Conference on Rate and/or Frequency-Controlled Speech. Louisville,
Ky.: University of Louisville, 1971.

Woodcock, R. W., & Clark, C. R. Influence of presentation rate and
media on the comprehension of narrative material by adolescent
educable mental retardates. Institute on Mental Retardation and
Intellectual Development, 1968, 5(7).




The  Effect  of  Different  Levels  of  Audio  and  Video  Compression 
upon  a  Televised  Demonstration  in  Microbiology 
by  Blind,    M. 




THE  EFFECTS  OF  DIFFERENT  LEVELS  OF  AUDIO  AND  VIDEO 
COMPRESSION  UPON  A  TELEVISED  DEMONSTRATION 
IN  MICROBIOLOGY 

MaryAnn   Blind,    Ph.D. 

The  purpose  of  this  study  was  to  assess  the  differential  effects  upon 
learning  of  instructional  television  presentations  composed  of  various  com- 
binations of  compressed  and  uncompressed  audio  and  video  components. 
There  were  two  dependent  variables,    score  on  a  paper  and  pencil  posttest 
which  tested  verbal  content,   and  accuracy  of  performance  of  procedures 
demonstrated  on  the  videotapes.     There  were  three  independent  variables, 
the  various  combinations  of  compressed  audio  and  video  components,   the 
presence  or  absence  of  a  pretest,    and  the  order  of  taking  the  posttest  and 
performance  of  procedures. 

Procedure 


The  subjects  involved  were  250  students  enrolled  in  a  Microbiology 
course. The stimulus materials employed were color videotapes on the
subject of staining bacterial slides. The video portion of the videotape was
"compressed" by editing the original uncompressed video so that its content
was coordinated with the content of the compressed audio, for compression
values of 66 2/3% or 50%. For the videotapes composed of 100% video and
66 2/3% or 50% audio, the compressed audio was dubbed onto the
uncompressed video in the appropriate places.

Results  and  Conclusions 

The results of the posttest were analyzed by analysis of variance, and
the results of the performances were analyzed by chi-square. The results
of the analyses of variance suggest that learning occurred as readily with
compressed material as with uncompressed material. The results of the
chi-square analyses indicate that when the audio and video components
were compressed to the same degree, learning occurred as readily with
compressed as with uncompressed material. However, when the audio and
video components were compressed to different degrees, learning occurred
more readily with compressed audio and uncompressed video than with
uncompressed audio and uncompressed video. This result seems to be
concerned not with the compression variable per se but rather with the
temporal relation of the audio and video components. This could be
because, in matching the compressed audio to the uncompressed video, the
verbal explanation of a step was mostly given before showing the
performance of that step, thereby leaving the subject free to concentrate
on the performance itself.




The  Effect  of  Different  Levels  of  Audio  and  Video  Compression 
Upon  a  Televised  Demonstration  in  Microbiology. 

MaryAnn  Blind 

Any  major  addition  to  the  world  of  instructional  technology  initiates 
numerous studies, often dead ends, into the possible applications of that
addition.   After  the  initial  impact  of  compressed  speech  on  learning  for 
the  blind,  sporadic  studies  have  occurred  concerning  the  combining  of 
compressed  speech  with  visuals.   These  visuals  range  from  filmstrips  and 
slides  to  film  and,  more  recently,  television. 

Research  in  the  combining  of  compressed  speech  and  television  is 
thin  and  inconclusive.   Loper  (1967)  studied  the  time  compression  of 
factual  messages  presented  over  a  television  system  to  determine  whether 
this  provided  increased  efficiency.   His  results  were  generally  incon- 
clusive, in  part  due  to  weaknesses  in  his  study.   These  weaknesses  were 
listed  by  Benz  (1971)  who  reviewed  Loper's  study,  viz.,  (a)  the  message 
might  have  required  little  or  no  visual  augmentation  since  it  was  origi- 
nally intended  to  communicate  only  by  aural  means,  (b)  the  dictum  (that 
Loper  followed)  that  the  picture  must  change  every  fifteen  seconds  might 
have  taken  precedence  over  the  demands  of  the  content,  and  (c)  the  video 
was  not  of  high  quality  which  may  have  resulted  in  little  significant 
information  being  transmitted.   Benz  undertook  research  to  test  the 
effect  of  compressed  speech  in  a  televised  presentation  on  the  compre- 
hension of  factual  material.   Benz  had  somewhat  better  results  than  Loper. 
Benz  concluded  that  televised  lecture  presentations  which  require  a  visual 
complement  can  be  increased  in  word  rate,  with  a  reduction  in  presentation 
time  by  one-third  without  any  significant  loss  in  comprehension.   However, 
although  he  did  not  repeat  any  of  Loper's  errors,  both  studies  had  one 
major flaw which qualifies any significant findings. That is, neither
message required television (the moving image) as the best method of
transmission.   Both  presentations  consisted  of  series  of  still  visuals. 
In  fact,  in  Benz's  study,  a  sketch  of  the  lecturer  was  on  the  screen 
when  no  other  visuals  were  necessary. 

In  planning  the  present  study,  therefore,  a  message  which  required 
television  was  used. 

PURPOSE 
The  purpose  of  this  study  was  to  assess  the  differential  effects 
upon  learning  of  instructional  television  presentations  composed  of 
various  combinations  of  compressed  and  uncompressed  audio  and  video 
components.   The  compression  of  each  component  is  defined  as  percentage 
of  the  original  recording  time  (104  words-per-minute) — viz.,  100%, 
66  2/3%,  and  50%.   The  subject  of  the  stimulus  materials  concerned  the 
demonstration of certain techniques in microbiology which require precise
manipulation of instruments. Criterion data was gathered through the use
of multiple choice tests and evaluation of the performance of the
techniques by each student.

METHOD 

This  section  contains  a  description  of  the  subjects,  the  stimulus 
materials,  the  apparatus,  and  the  procedures  used  in  the  study. 
Subjects 

The  subjects  were  250  students  at  Indiana  University,  primarily 
juniors  and  seniors,  most  of  whom  were  majoring  in  biology  or  related 
sciences.   These  students  were  enrolled  in  Microbiology  Laboratory  M255, 
an  introductory  laboratory. 




Stimulus  Materials 

The  stimulus  materials  employed  were  color  videotapes  on  the  sub- 
ject of  staining  bacterial  slides.   There  were  five  videotapes  consisting 
of  five  combinations  (one  each)  of  values  of  compression  of  the  two  com- 
ponents, as  follows: 

1.  100%  audio,  100%  video 

2.  66  2/3%  audio,  66  2/3%  video 

3.  50%  audio,  50%  video 

4.  66  2/3%  audio,  100%  video 

5.  50%  audio,  100%  video. 

Combinations one, two, and three have equal compressions of the audio
and the video, and combinations four and five have uncompressed video
(100%) with compressed audio (66 2/3%, 50%). In these last two combinations,
the audio was segmented and matched to the video. This results in much
or all of the verbalizing occurring just at the beginning of the
demonstration of each step.

The  combinations  of  particular  interest  are  those  numbered  one,  two, 
and  three.   The  primary  interest  in  these  combinations  was  based  upon  the 
fact  that  they  represented  the  same  amount  of  compression  of  both  the 
video  and  audio  components  of  the  presentation.   Combinations  four  and 
five  were  used  primarily  for  control  purposes,  viz.,  to  separate  the 
effects  of  the  degree  of  compression  of  the  audio  from  the  effects  of 
the  degree  of  compression  of  the  video. 

The subject of the videotaped materials was the performance of and
rationale for the procedures used in staining bacterial slides. Television
was chosen because it is said that the performance of the staining
procedures involves certain critical movements whose nature cannot be
communicated easily to students without using a medium that permits
motion to be depicted. The videotaped materials were designed to enable
the students to achieve four types of learning, viz., motor chains,
verbal chains, discriminations, and concepts. The acquisition of verbal
chains, discriminations, and concepts (the rationale for the staining
procedures) was tested with a paper and pencil test.

Each of five groups of subjects viewed a separate one of the
five videotapes described above. Each of these five groups was divided
into two subgroups, viz., subjects who received a paper and pencil pretest
and subjects who did not receive a pretest. This was done to
determine to what degree the pretest might sensitize subjects and thus
affect their performance of the staining procedures or the later paper
and pencil test. Each of these two subgroups was then divided further
into two sub-subgroups. This last division determined the order in which the
subject was tested. One sub-subgroup took the paper and pencil test
first and then performed the staining procedures. The second sub-subgroup
performed the staining procedures first and then took the paper and pencil
test. In this way, any effect of either of the dependent variable measuring
operations on the values of the other dependent variable could be
determined. There were therefore 20 treatment groups (5 videotapes x 2
pretest conditions x 2 test orders).
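The count of 20 treatment groups follows from crossing the three factors described above. A minimal sketch that enumerates the cells (the labels are illustrative, not taken from the study materials):

```python
from itertools import product

# (audio, video) compression values for the five videotapes
tapes = [
    ("100%", "100%"),
    ("66 2/3%", "66 2/3%"),
    ("50%", "50%"),
    ("66 2/3%", "100%"),
    ("50%", "100%"),
]
pretest = [True, False]  # pretest given or not
orders = ["posttest_then_performance", "performance_then_posttest"]

# Every combination of tape, pretest condition, and testing order is one cell.
groups = list(product(tapes, pretest, orders))
print(len(groups), "treatment groups")
```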

The audio portion of the videotape was compressed at the Perceptual
Alternatives Laboratory, University of Louisville, Louisville, Kentucky.
The video portion of the videotape was "compressed" by editing the original
uncompressed video so that its content was coordinated in time with
the content of the compressed audio, for compression values of 66 2/3%
or 50%. For the videotapes composed of 100% video and 66 2/3% or 50%
audio, the compressed audio was dubbed onto the uncompressed video in
the appropriate places. In other words, the content of the audio was
matched with the content of the video.


The  paper  and  pencil  test  consisted  of  multiple  choice  questions 
covering  the  verbal  chains,  discriminations  and  concepts  that  were  to 
be  learned.  The  performance  of  the  staining  procedures  constituted  a 
test  of  the  acquisition  of  motor  chains.  Each  subject  was  given  a  list 
of  the  main  steps  involved  in  the  staining  of  bacterial  slides  at  the 
time  of  this  performance.  Each  main  step  involved  two  or  more  sub-steps, 
which  were  not  on  the  list. 

The  instructions  to  the  subject,  the  paper  and  pencil  test(s),  and 
the  list  of  main  steps  were  included  in  a  booklet. 

A professional reader recorded the audio at 104 WPM. The audiotape
was then compressed to 66 2/3% and to 50% of the original time.
The discard interval was 20 msec.

Apparatus

The  audio  portion  of  the  videotape  was  recorded  in  a  high  quality 
sound  studio  at  Indiana  University.   The  material  was  recorded  at 
7  1/2  ips,  fulltrack,  using  an  Ampex  tape  deck. 

The material was then compressed by being reproduced on a Crown
tape playback machine, series 700. The signal was fed through a CBS
Audimax III automatic level control, Model 444, to the input of a Graham
speech compressor. The output was fed through a Langevin program
equalizer, Model EQ-258-A, to the input of a Crown tape recorder, series
700. The output of the tape recorder was monitored by an Altec
monitor-amplifier, Model 1591A, driving an Acoustic Research speaker system,
Model AR2ax.
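The text reports only the 20 msec discard interval. As a rough sketch, and assuming the compressor used the periodic sampling method (short segments discarded at regular intervals and the retained segments abutted), the retained interval implied by each compression value can be derived as below. The derived durations and the helper names are assumptions for illustration, not figures reported by the authors.

```python
from fractions import Fraction


def retained_interval_ms(discard_ms, fraction_of_original_time):
    """Retained segment length such that keep / (keep + discard) equals the target fraction."""
    r = Fraction(fraction_of_original_time)
    if not 0 < r < 1:
        raise ValueError("fraction must lie strictly between 0 and 1")
    return discard_ms * r / (1 - r)


def compress(samples, keep, discard):
    """Keep `keep` samples, drop the next `discard` samples, and repeat to the end."""
    out = []
    period = keep + discard
    for start in range(0, len(samples), period):
        out.extend(samples[start:start + keep])
    return out


keep_two_thirds = retained_interval_ms(20, Fraction(2, 3))
keep_half = retained_interval_ms(20, Fraction(1, 2))
print(keep_two_thirds, keep_half)
```

Under these assumptions, a 20 msec discard interval pairs with a 40 msec retained interval at 66 2/3% and a 20 msec retained interval at 50%.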

The audio for the videotapes was then dubbed onto videotapes recorded
in a high quality television studio at Indiana University on a
Quad Ampex, Model 1200, and edited by a Mark II Editech System. This
same editing system was used to "compress" the video portions of the
presentation.

The  videotapes  were  then  dubbed  onto  cassettes  and  played  back  on 
a  Sony,  Model  1800,  cassette  videotape  recorder,  and  viewed  on  a  Sony 
color  monitor. 

The  experiment  was  conducted  in  a  microbiology  laboratory  under 
normal  class  conditions.   The  laboratory  contains  30  carrels  and  one 
videotape  recorder  and  monitor.   Each  carrel  contains  an  audiotape 
deck  and  laboratory  equipment.   The  laboratory  is  individualized  with 
one student per carrel. Each experience is directed by the audiotapes.

Procedure

Ten  sections  of  the  microbiology  course  were  used.   Each  section 
had  20-30  students.   The  twenty  treatments  were  randomly  assigned  to 
the  class  sections  with  the  restriction  that  both  orders  were  used  in 
each  class  section.   This  resulted  in  the  assignment  of  one  videotape 
either  with  or  without  pretest  to  each  of  the  10  laboratory  sections. 
Students  in  five  of  the  sections  took  a  pretest;  students  in  the  other 
five  sections  did  not.   Approximately  half  the  students  in  each  section 
took  the  posttest  before  they  performed  the  staining  procedures;  the 
other  half  of  the  students  performed  the  staining  procedures  before 
they  took  the  posttest. 
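The assignment scheme just described can be sketched as follows. The source does not say how the randomization was carried out, so the shuffling procedure, the seeds, and all names here are hypothetical.

```python
import random
from itertools import product

TAPES = ["100A/100V", "66A/66V", "50A/50V", "66A/100V", "50A/100V"]


def assign_sections(section_ids, seed=None):
    """Randomly give each section one (videotape, pretest?) condition."""
    section_ids = list(section_ids)
    conditions = list(product(TAPES, (True, False)))  # 5 tapes x 2 pretest states
    if len(section_ids) != len(conditions):
        raise ValueError("need exactly one section per condition")
    rng = random.Random(seed)
    rng.shuffle(conditions)
    return dict(zip(section_ids, conditions))


def split_orders(student_ids, seed=None):
    """Split one section roughly in half between the two testing orders."""
    rng = random.Random(seed)
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"posttest_first": shuffled[:half], "performance_first": shuffled[half:]}


assignment = assign_sections(range(1, 11), seed=0)
split = split_orders(range(25), seed=1)
```

Each of the ten sections receives a distinct tape-by-pretest condition, and both testing orders occur within every section, which together yield the twenty treatment groups.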

Each subject entered the laboratory, went to a carrel, put on earphones,
switched on the tape recorder, and followed the recorded directions
for classroom activity. At one point on this tape, the subject
was directed to get a booklet from a laboratory assistant and follow
the directions in the booklet. When the student completed the performance
of procedures, he was directed by his booklet to ask a laboratory
assistant to look at the results of his performance and put a grade
(pass or fail) on the cover of the booklet. The subject then followed
the rest of the instructions in the booklet, and turned the booklet in
to one of the laboratory assistants. All laboratory assistants had the
same preset list of criteria for evaluating the product of the performance,
that is, for evaluating the quality of the Gram stain. The
laboratory assistants were trained in the use of the criteria to ensure
reliability.

The  items  on  the  posttest  were  paraphrases  of  the  items  on  the 
pretest,  and  the  posttest  items  were  in  a  different  (random)  order  than 
were  the  pretest  items.   The  subjects  were  interviewed  to  determine  the 
extent  to  which  crosstalk  might  have  occurred. 

RESULTS 

Four three-way analyses of variance were used to determine the
significance of experimental effects with respect to the paper and
pencil test; four chi-square analyses, each involving a four-way
contingency table, were used to determine significance with respect to the
performance of the staining procedures; and a point biserial correlation
was used to determine the magnitude of the relationship between the results
of the paper and pencil test and the performance. There is also an ex post
facto analysis of variance to determine whether there was a significant gain
in mean paper and pencil score from pretest to posttest, and interviews to
determine the extent to which crosstalk might have occurred. In all
appropriate cases, alpha was preset at .05. The four analyses of variance and
four chi-square analyses deal with the following:

1. Equal compression values of audio and video components:

   a. 100% audio, 100% video
   b. 66 2/3% audio, 66 2/3% video
   c. 50% audio, 50% video

2. Uncompressed video components:

   a. 100% audio, 100% video
   b. 66 2/3% audio, 100% video
   c. 50% audio, 100% video

3. 66 2/3% compression value of the audio component and
   66 2/3% and 100% compression values of the video component:

   a. 66 2/3% audio, 100% video
   b. 66 2/3% audio, 66 2/3% video

4. 50% compression value of the audio component and 50% and
   100% compression values of the video component:

   a. 50% audio, 100% video
   b. 50% audio, 50% video.

Analyses of Variance

In all four analyses, there were no significant F-ratios for main
effects. Therefore, there was no evidence of an overall effect on learning
of various degrees of compression of the audio and/or video components of
the presentation; there was no evidence of an overall effect on learning
as a function of the presence or absence of a pretest; and there was no
evidence of an overall effect on learning as a function of the order of
occurrence of the paper and pencil posttest and the performance of the
staining procedures. There were also no significant F-ratios for
interactions in the first, third, and fourth analyses. There was one
significant F-ratio in the second analysis for an interaction, viz., that
between degree of compression and order of taking the posttest and
performing the staining procedures. A Newman-Keuls test of means showed
that the effect of the order variable was nonsignificant at each of the
values of the audio compression variable, which suggests that perhaps the
interaction was significant by chance.

The results of these four analyses of variance suggest that learning
occurred as readily with compressed material (156 WPM and 208 WPM) as with
uncompressed material (104 WPM).

Chi-square Analyses

These analyses, which tested the results of the performance of the
staining procedures, were done using distribution-free (i.e., nonparametric)
tests of hypotheses concerning main effects and interactions
ordinarily tested by analysis of variance (Wilson, 1956).

For the first analysis, dealing with equal compression values of
audio and video components, there were no significant chi-squares for
main effects or for interactions. Even though the total of all the
interactions was significant, no single interaction was significant.

For the second analysis, dealing with uncompressed video components,
there was one significant chi-square, for the main effect of degree of
compression, but none for any interaction. Again, even though the total
of all the interactions was significant, no single interaction was
significant. An examination of the percentages of subjects judged to have
performed the staining procedures satisfactorily at the various degrees
of compression indicates that more learning occurred at compression
values of 100% video and 66 2/3% audio than at values of 100% video and
50% audio or at values of 100% video and 100% audio, and that more
learning occurred at values of 100% video and 50% audio than at values
of 100% video and 100% audio.

For the third analysis, dealing with an audio compression value of
66 2/3% and video compression values of 100% and 66 2/3%, there were two
significant chi-squares, one for the main effect of degree of compression
and the other for an interaction, viz., that between degree of compression,
presence of pretest, and order. In the case of the significant
main effect, an examination of the percentages of subjects judged to have
performed the staining procedures satisfactorily at the various degrees
of compression indicates that more learning occurred at compression values
of 100% video and 66 2/3% audio than at 66 2/3% video and 66 2/3% audio.
In the case of the significant interaction, an examination of the
percentages of subjects judged to have performed the staining procedures
satisfactorily indicates that the main source of the interaction is as
follows. When subjects had taken the pretest, the magnitude of the
superiority of the 100% video condition to the 66 2/3% video
condition was very nearly the same for both orders of the two dependent
variable measures. When subjects had not taken the pretest, the magnitude
of the superiority of the 100% video condition
was much greater when the performance of the staining procedures occurred
first and the taking of the posttest occurred second than when the order
was reversed. The simplest characterization of this interaction is perhaps
that the superiority of the 100% video condition is especially great
when, prior to the performance of the staining procedures, subjects have
taken neither the pretest nor the posttest.

For the fourth analysis, dealing with an audio compression value
of 50% and video compression values of 100% and 50%, there were no
significant chi-squares for main effects or for interactions.

Pretest and Posttest Analysis of Variance

A cursory examination of the pretest results revealed that many of
the pretest scores were quite high. Therefore, the decision was made
to conduct an analysis of variance for the purpose of determining
whether, for those subjects who took the pretest, there was evidence
that a statistically significant amount of learning occurred. Clearly,
if there were no evidence that learning occurred, it would not be
meaningful to report the effects of compression. The five different
combinations of audio and video compression values were combined with
the two tests, pre- and posttest, in a 5 x 2 factorial design. The results
of this analysis showed that the mean posttest score was significantly
greater than the mean pretest score. The interaction was not significant.
Thus, there is evidence that learning did occur. The pretest
mean number of correct responses was 10.45, while the posttest mean was
12.84. Thus, there was an average increase from pre- to posttest of
2.38 items. Since the maximum score was 15, the greatest possible
average increase was 4.54 items. Therefore, on the average, subjects
learned 52% of what they could have learned. Of course, the fact that
the mean pretest score was quite high makes the experiment less
satisfactory than it would have been otherwise. Since there was relatively
little to be learned, we would expect a "ceiling effect," which might
lead to a smaller likelihood of there being significant differences.
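The "52% of what they could have learned" figure is the observed gain divided by the largest gain the test allowed. A minimal sketch of that arithmetic, using the rounded means reported above (the function name is illustrative):

```python
def normalized_gain(pre_mean, post_mean, max_score):
    """Observed gain as a share of the largest gain the test ceiling allows."""
    possible = max_score - pre_mean
    if possible <= 0:
        raise ValueError("pretest mean is already at or above the maximum score")
    return (post_mean - pre_mean) / possible


gain = normalized_gain(pre_mean=10.45, post_mean=12.84, max_score=15)
print(f"{gain:.1%} of the possible gain")
```

Computed from the rounded means, the gain and the ceiling come to 2.39 and 4.55 items rather than the 2.38 and 4.54 quoted in the text, presumably because the reported values were derived from unrounded means; the ratio is about 52% either way.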

Point Biserial Correlation

A point biserial correlation coefficient was computed to determine
the magnitude of the relationship between score on the paper and
pencil posttest and whether the performance of the staining procedures
was judged to be satisfactory. The value of this correlation was -.09.
The significance of this correlation was tested by using 1/√N as its
standard error. The value -.09 was not significantly different from
zero. This result suggests that the posttest score and the dichotomous
variable of whether the performance of the staining procedures is
satisfactory are independent of each other.

However,  it  seemed  possible  in  principle  that  the  magnitude  of 
the  point  biserial  correlation  could  vary  as  a  function  of  whether  the 
posttest  preceded  the  staining  performance  or  vice  versa.   Therefore, 
separate  point  biserial  correlations  were  computed  for  subjects  who 
had  had  each  of  those  two  orders. 

Posttest precedes performance. In this case, r_pb = .01 and was
nonsignificant.

Performance precedes posttest. In this case, r_pb = .239 and was
significant. However, even though .239 is significantly different from
zero, it seems of little importance in view of the fact that only 6%
of the variance in the posttest scores could be predicted from the
performance scores and vice versa.
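For readers who wish to retrace these figures, the sketch below gives one standard form of the point biserial coefficient, the 1/√N significance check described earlier, and the squared correlation behind the "6% of the variance" statement. The toy scores are hypothetical, and N = 250 is assumed here only because that is the total number of subjects; the text does not state the N used for each correlation.

```python
import math
from statistics import mean, pstdev


def point_biserial(scores, passed):
    """Point biserial r between a continuous score and a pass/fail (1/0) outcome."""
    if len(scores) != len(passed):
        raise ValueError("scores and passed must have the same length")
    group1 = [s for s, p in zip(scores, passed) if p]
    group0 = [s for s, p in zip(scores, passed) if not p]
    sd = pstdev(scores)
    if not group1 or not group0 or sd == 0:
        raise ValueError("both outcomes and some score variance are required")
    p = len(group1) / len(scores)
    return (mean(group1) - mean(group0)) / sd * math.sqrt(p * (1 - p))


def z_against_zero(r, n):
    """z statistic for r, taking 1/sqrt(N) as its standard error."""
    return r * math.sqrt(n)


def variance_explained(r):
    """Proportion of variance in one variable predictable from the other."""
    return r * r


toy_r = point_biserial([1, 2, 3, 4], [0, 0, 1, 1])  # hypothetical data
overall_z = z_against_zero(-0.09, 250)               # N = 250 is an assumption
share = variance_explained(0.239)
print(f"toy r_pb = {toy_r:.3f}, z = {overall_z:.2f}, r^2 = {share:.3f}")
```

The squared value of .239 is roughly .057, which rounds to the 6% quoted above, and under the assumed N the z for -.09 falls short of the 1.96 needed at alpha = .05.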

In sum, the nonsignificant or extremely low correlations would
seem to indicate a virtual lack of relationship between the two variables.
However, this must be viewed cautiously because, owing to the lack of
variance in both the posttest and the performance scores, the correlations
could not be high.

Interviews to Determine Crosstalk

Approximately 10% of the subjects involved in the study were
interviewed to determine the extent to which crosstalk might have occurred.
Of the 25 subjects interviewed, two, or 8%, did talk to someone who had
participated in the experiment before they themselves had participated
in it, although they did not recall the nature of the conversation.
Of the same 25 subjects, four, or 16%, did talk to someone who had
participated in the experiment after they themselves had participated in it.
Again, subjects did not recall the nature of the conversation. These
small percentages would seem to indicate that crosstalk was not a
serious factor in determining the results of this study.


DISCUSSION 

As  mentioned  earlier,  the  results  of  the  four  analyses  of  variance 
testing  the  paper  and  pencil  posttest  scores  suggest  that  learning 
occurred  as  readily  with  compressed  material  as  with  uncompressed 
material.   This  supports,  again,  the  finding  of  numerous  other  studies 
that  compressed  material  often  is  as  effective  as  and  more  efficient 
than  uncompressed  material  for  the  learning  of  verbal  content.   It  should 
be  emphasized  here  that  only  verbal  content  is  being  referred  to.   In 
none  of  the  above-mentioned  studies  was  there  an  attempt  to  teach  a 
performance  (motor  chain)  by  means  of  compressed  speech. 

The results of the four chi-square analyses testing the performance
of the staining procedures indicate that when both the
audio and the video components were compressed to the same percentage,
learning occurred as readily with compressed material as
with uncompressed material. The percentages of subjects passing who
viewed the videotapes with compression components of 100% audio and
100% video, 66 2/3% audio and 66 2/3% video, and 50% audio and 50%
video were 62%, 78.5% and 81.5% respectively. These analyses also
indicated that the videotape with compression components of 100% video
and 50% audio and the videotape with compression components of 50%
video and 50% audio taught equally well. However, the percentages
performing satisfactorily (88% with the former and 81.5% with the latter)
show, even though they are not significantly different, a direction
similar to results in this study which were significant. These results
show that the videotape with compression components of 100% video and
66 2/3% audio was significantly better than the videotape with compression
components of 100% video and 100% audio, and than the videotape with
compression components of 66 2/3% video and 66 2/3% audio. We may
summarize these various results by saying that subjects learned more
from videotapes with compressed audio and uncompressed video than from
videotapes with equal compression values of audio and video. This was
an unanticipated result. It could be due to the placement of the compressed
audio on the uncompressed video, i.e., the beginning of each
step on the audio was matched with the beginning of each step on the
video, thereby resulting in much of the performance of that step following
the verbal explanation of the step. In other words, the verbal
explanation of the step was mostly given before showing the performance
of that step, thereby leaving the subject free to concentrate on the
performance itself. This apparent result could have implications for
videotaped lessons in general, for the result is concerned not with the
compression variable per se but rather with the temporal relation of the
audio and video components. This seems especially to be the case since
the videotape with compression values of 100% video and 50% audio resulted
in performances that were more often satisfactory than performances
following the videotape with compression values of 100% video
and 100% audio.

Finally,  it  will  be  recalled  that  100%  compression  is  equal  in 
the  present  study  to  104  WPM.   This  rather  slow  audio  rate  was  due  to 
the  nature  of  the  material,  which  we  may  characterize  broadly  as  being 
scientific.   The  present  results  suggest  that  when  there  is  a  video 
component  along  with  the  audio  component,  faster  word  rates  may 
actually facilitate learning.

Suggestions for Further Research

It would be of interest to determine whether even higher word rates
than those used in the present study would increase further the amount
of learning of the performance of the staining procedures. Further,
the generality of such effects might be profitably determined through
instruction in the performance of still other motor skills, both in
scientific and other content areas. However, it would seem advisable
that a pretest of the performance of a motor skill be incorporated in
research designs so that one has a measure of how much learning occurred.
Further, it would seem advisable to measure the subjects' performance
at each step in the procedure in order to have perhaps a more sensitive
measure of what has been learned.

Another suggestion falls both inside and outside of the realm of
compression.   Research  in  videotaping  is  indicated  for  the  process  of 
placing  the  verbal  explanation  of  a  step  in  a  motor  skill  before  showing 
the  performance  of  that  step.   Compression  of  speech  is  an  obvious 
choice  for  this  verbal  explanation,  especially  if  it  is  of  any 
significant  length. 


Benz, C.R. Effects of time-compressed speech upon the comprehension
of a visually oriented televised lecture. (Doctoral dissertation,
Wayne State University, 1971). Dissertation Abstracts International,
1971, 32, 6579-A. (University Microfilms No. 72-14523).

Loper, J.L. An experimental study of some effects of time compression
upon the comprehension and retention of a visually augmented televised
speech. (Doctoral dissertation, University of Southern
California, 1967). Dissertation Abstracts, 1967, 27, 437A.
(University Microfilms No. 67-6505).

Wilson, K.V. A distribution-free test of analysis of variance hypothesis.
Psychological Bulletin, 1956, 53(1), 96-101.


Time-Expanded Speech: Clinical Applications to the Diagnosis of
Speech Disorders

by Lass, N. J., Foulke, E., Supler, R. A.


THE USE OF TIME-EXPANDED SPEECH AS AN AID IN THE
DIAGNOSIS OF SPEECH DISORDERS

Norman J. Lass, Emerson Foulke, and Rebecca A. Supler

ABSTRACT

The recordings of 20 speech defective speakers' readings of a standard
prose passage were presented to 36 student speech clinicians under three
listening conditions: (1) unaltered; (2) time-expanded to 150% of the original
recording time; and (3) time-expanded to 200% of the original recording
time. Results of their evaluations indicate that the expanded versions of
the recordings elicited a larger number of detected errors as well as more
accurate evaluations by the student clinicians. Implications of these findings
concerning the possible usefulness of time-expanded speech as an aid in the
training of listening skills in student speech clinicians are discussed.


Norman  J.  Lass,  Ph.D. 
Speech  Pathology-Audiology 
West  Virginia  University 
Morgantown, West Virginia


Emerson Foulke, Ph.D.
Perceptual  Alternatives  Laboratory 
University  of  Louisville 
Louisville,  Kentucky 


Rebecca  A.  Supler,  M.S. 
Speech  Pathology-Audiology 
West  Virginia  University 
Morgantown,  West  Virginia 


Paper presented at the Third Louisville Conference on Rate-Controlled
Speech, November 3-5, 1975, Louisville, Kentucky, and to be published in
Journal of Speech and Hearing Disorders, 1976.


INTRODUCTION

Since the development of special equipment which allows us to control
the rate of recorded speech without affecting the pitch of the speaker's
voice, the emphasis has been on reducing the reproduction time of the
recording. It is the literature on time-compression which has shown continued
growth over the years, while time-expanded speech and its applications have
not received extensive attention from investigators of either applied or
basic research.


We are presently investigating the usefulness of time-expanded speech in
speech pathology training programs. Specifically, we are attempting to
determine the potential value of time-expanded speech as an aid in the training of
listening skills in student speech clinicians, especially beginning students.
Frequently, a difficult task for beginning student clinicians in speech
pathology is that of developing adequate listening skills to detect a speaker's
specific speech disorders. Such ear training is usually developed over a
period of several years and is obtained through the various clinical
experiences of the student clinicians. We are attempting to assess the usefulness
of time-expanded speech as a technique for accelerating and/or improving
the ear training process. In the first phase of this project, it was
important to determine if time-expanded recordings improved students'
judgments of articulation disorders when compared to their evaluations made
with unaltered recordings. Our present study was concerned with this
issue.

METHOD 
Speakers 

The recordings of 20 speech defective speakers' readings of Fairbanks'
(1960) "The Rainbow Passage" were made in a sound-treated room using
high-quality recording equipment. All speakers were adults with speech disorders,
and all but three speakers were judged to have articulation errors, including
omissions, distortions, and substitutions of vowels as well as consonants.
The average number of misarticulations for the group was 3.2, and the range
was 0-10 errors per speaker. Some speakers manifested other kinds of speech
problems in addition to, or in place of, articulation errors, including
those of voice quality, resonance, and prosody.

Listeners

A total of 36 student clinicians, 35 females and one male, participated
in the study. All were students
in the Speech Pathology-Audiology program at West Virginia University and
had completed the beginning coursework in the program, including a survey
course in speech and hearing disorders, a course in applied phonetics, and
a course in voice and articulation disorders. The average number of
clinical clock hours of therapy for the group was 38. None of the student
clinicians had any reported hearing difficulty or any previous exposure
to time-expanded speech.

Experimental Procedures

All subjects participated in a total of three listening sessions. In
each session, they were asked to list phonetically all specific articulation
errors that they perceived in each of the recordings of the 20 speakers.
In each listening session, the master tape containing the speakers' recordings
was presented to the student clinicians in one of three conditions: (1) unaltered;
(2) time-expanded to 150% of the original recording time; and (3)
time-expanded to 200% of the original recording time. The order of presentation
of the three conditions was counterbalanced so that six orders were used,
with six subjects in each of the order groups.
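The six orders are simply all permutations of the three listening conditions. A minimal sketch of such a counterbalanced schedule follows; the round-robin allocation of subjects to orders is an assumption, since the text does not say how subjects were allotted.

```python
from collections import Counter
from itertools import permutations

CONDITIONS = ("unaltered", "150%", "200%")


def counterbalance(subject_ids, conditions=CONDITIONS):
    """Assign each subject one presentation order, cycling through all orders."""
    subject_ids = list(subject_ids)
    orders = list(permutations(conditions))  # 3! = 6 possible orders
    if len(subject_ids) % len(orders):
        raise ValueError("subjects cannot be split evenly across the orders")
    return {sid: orders[i % len(orders)] for i, sid in enumerate(subject_ids)}


schedule = counterbalance(range(1, 37))       # 36 listeners
group_sizes = Counter(schedule.values())
print(len(group_sizes), "orders,", set(group_sizes.values()), "subjects per order")
```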

All listening conditions were presented binaurally to the subjects by
means of a Lexicon Varispeech I speech compressor and matched Sharpe model
HA-10A headphones. All listening sessions were held in a sound-treated room.

Criterion Measures

In an attempt to determine the degree of accuracy of the listeners'
judgments under the three listening conditions, all recordings were judged
by three professional speech pathologists, members of the faculty of
the Speech Pathology-Audiology program at West Virginia University, who
listened to the recordings and listed phonetically the specific articulation
errors manifested by each of the 20 speakers. Only those errors which were
listed by all three speech pathologists were employed as the criterion
measures against which to compare the errors listed by the student clinicians
in the study.
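
As a rough illustration of this scoring rule, the sketch below treats each judge's list as a set, keeps only the errors common to all three judges, and scores a listener as the percentage of those criterion errors that the listener also listed. The error labels and the function names are hypothetical, and the paper does not say how extra errors listed by a student (those not in the criterion set) were handled; here they are simply ignored.

```python
def criterion_errors(*judge_lists):
    """Keep only the errors listed by every judge."""
    sets = [set(errors) for errors in judge_lists]
    return set.intersection(*sets)

def percent_correct(listener_errors, criterion):
    """Percentage of criterion errors that the listener also listed."""
    if not criterion:
        raise ValueError("criterion set is empty")
    hits = set(listener_errors) & criterion
    return 100.0 * len(hits) / len(criterion)

# Hypothetical transcriptions for one speaker (target -> produced).
judge_a = {"s->th", "r->w", "l->w", "k->t"}
judge_b = {"s->th", "r->w", "l->w"}
judge_c = {"s->th", "r->w", "l->w", "g->d"}
criterion = criterion_errors(judge_a, judge_b, judge_c)

# Hypothetical student list: two criterion errors plus one extra.
student = {"s->th", "r->w", "z->s"}
score = percent_correct(student, criterion)
print(sorted(criterion), round(score, 2))
```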

RESULTS 
The results of the student clinicians' evaluations under the three
listening conditions are presented in Slide 1. The slide indicates that
the students correctly identified the largest percentage of articulation
errors when they were presented with the master tape under the condition of
greatest time expansion (200%), the next largest percentage of errors under
the 150% time expansion condition, and the smallest percentage of correctly
identified errors under the normal, unaltered listening condition.

                         LISTENING CONDITIONS

            UNALTERED      150% EXPANDED      200% EXPANDED

Mean:         35.13            41.54              44.19

S.D.:         11.47            10.42              10.64

Range:      9.23-64.62      16.92-63.08        23.08-66.15

SLIDE 1


Slides 2 and 3 provide information on subjects' performance for each
of the speakers on the master tape. They show that the listeners achieved
the highest percentage of correctly identified articulation errors for 14
of the 20 speakers under the 200% time expansion condition and for two of
the speakers under the 150% time expansion condition. Thus, for 16 of the
20 speakers in the study, the student clinicians correctly identified more
errors under the time expansion conditions than under the normal, unaltered
listening condition.

Inferential statistical analysis, consisting of a Friedman two-way
analysis of variance by ranks (Siegel, 1956), indicated that the percentage
of correctly identified errors differed significantly among the three
listening conditions in the study ($\chi_r^2 = 14.88$, df = 2, p < .001).
Furthermore, by means of Wilcoxon matched-pairs signed-ranks tests (Siegel,
1956), it was found that the percentage of correctly identified errors in the
unaltered condition differed significantly from those reported in the 150%
(z = -2.95, p < .01) and 200% (z = -3.51, p < .001) time expansion conditions.
However, there was no statistically significant difference between the 150%
and 200% time expansion conditions (z = -1.37).
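
For readers who wish to see how a Friedman statistic of this kind is obtained, a minimal sketch follows. Each listener's three scores are ranked within that listener, the ranks are summed by condition, and the statistic is $\chi_r^2 = \frac{12}{Nk(k+1)} \sum_j R_j^2 - 3N(k+1)$, where N is the number of listeners, k the number of conditions, and $R_j$ the rank sum for condition j. The scores below are hypothetical and are not the study's data; the function names are illustrative, and no correction for ties is applied beyond averaging tied ranks.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def friedman_chi_square(scores):
    """Friedman statistic; rows are listeners, columns are conditions."""
    n = len(scores)
    k = len(scores[0])
    rank_sums = [0.0] * k
    for row in scores:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    total = sum(r * r for r in rank_sums)
    return 12.0 / (n * k * (k + 1)) * total - 3.0 * n * (k + 1)

# Hypothetical percent-correct scores: unaltered, 150%, 200%.
scores = [
    [30, 40, 45],
    [35, 38, 50],
    [28, 41, 39],
    [33, 36, 44],
]
chi_r = friedman_chi_square(scores)
print(chi_r)
```

The resulting value is referred to the chi-square distribution with k - 1 degrees of freedom, which is why the paper reports df = 2 for three conditions.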

[Slide 2: bar graph of results by speaker (speakers 1-10) under the Normal,
150% Expanded, and 200% Expanded conditions.]

[Slide 3: bar graph of results by speaker under the Normal, 150% Expanded,
and 200% Expanded conditions.]

DISCUSSION
The findings of our investigation point to the possible usefulness
of time-expanded speech as an aid in the development of listening skills
in student speech clinicians. Although it has been shown that time-expanded
versions of recordings can elicit more accurate articulation evaluations
from student clinicians, it is still necessary to determine if employing
this technique as a supplement to other approaches, or as a direct approach
in a speech pathology training program, aids in, and/or accelerates, the
development of listening skills in student speech clinicians, especially
beginning students. That is, although we have shown that time-expanded
speech aids in the diagnosis of speech disorders by students, in the "real
world," during "live" diagnostic sessions, there is no "device" to "slow up"
the speech samples. What we do not know as yet is: will exposure to
time-expanded speech over a given period of time, and with progressive changes
in amount of expansion, improve one's listening skills and allow for a
retention of these skills once the time-expansion processing has been
reduced and eliminated? We are presently pursuing this issue.

Although it would seem like a logical and predictable finding, the
fact that time-expanded versions of the recordings employed in our study
elicited more accurate judgments from listeners than the unaltered versions
of the recordings is enlightening and provides us with useful information
for future applications of time-expanded speech. By the nature of the
process involved, the time alteration of recorded speech introduces a
certain amount of "distortion" into the reproduction system: the greater
the amount of time alteration, the greater the distortion. What was not
known prior to this investigation was the effect of this distortion on
listeners' judgments of articulation errors, especially for those who are
not trained in listening to time-altered signals.

It should be remembered that the judgments made in the present
investigation pertain only to articulation errors. Thus, the findings
cannot automatically be generalized to all kinds of speech disorders.
Perhaps the time expansion process may also prove useful in assessing other
parameters of speech, including voice and prosody. It is suggested that
future research fully explore this issue in an attempt to discover the
full range of usefulness of the time expansion process.

The low scores for percentage of correctly identified errors obtained
in this investigation are not surprising in light of the limited background
and training of the subjects employed as listeners: they were student
clinicians who lacked the necessary clinical experience to make very
accurate evaluations of the recordings. Furthermore, it should be noted
that they heard each of the 20 readings only once under each of the three
experimental conditions, while the speech pathologists who established
the errors used as criterion measures in this investigation were allowed
to listen to each reading as often as they wished.

It  would  be  interesting  to  determine  if  advanced  student  clinicians 
as  well  as  professional  speech  pathologists  show  differences  in  their 
judgments  of  articulation  errors  between  unaltered  and  time-expanded 
versions  of  recorded  speech  samples.   Perhaps  these  more  experienced 
listeners  would  benefit  more  from  the  time  expansion  process  and  thus 
show  greater  differences  in  correctly  identified  articulation  errors 
between  the  unaltered  and  time-expanded  conditions  than  the  student 
clinicians  in  the  present  study.  We  are  presently  investigating  this 
issue,  since  such  information  would  be  useful  for  future  applications 
of  time-expanded  speech  in  general  clinical  settings. 




REFERENCES 

Duker, S. (Ed.), Time-Compressed Speech: An Anthology and Bibliography.
Metuchen, New Jersey: Scarecrow Press, 1974.

Fairbanks, G., Voice and Articulation Drillbook. New York: Harper and
Row, 1960.

Foulke, E. (Ed.), Proceedings of the Louisville Conference on Time Compressed
Speech. Louisville: University of Louisville, 1967.

Foulke, E., Methods of controlling the word rate of recorded speech. Journal
of Communication, 20, 1970, 305-314.

Foulke, E. (Ed.), Proceedings of the Second Louisville Conference on Rate
and/or Frequency-Controlled Speech. Louisville: University of Louisville,
1971.

Foulke, E., and Sticht, T., Review of research on the intelligibility and
comprehension of accelerated speech. Psychological Bulletin, 72, 1969,
50-62.

Siegel,  S.,  Nonparametric  Statistics  for  the  Behavioral  Sciences.   New  York: 
McGraw-Hill,  1956,  pp.  75-83,  166-172. 




SSLR:     Simultaneous  Speeded  Listening  and  Reading 

A  Promising  Path  to  Remediation  of  Reading  Disabilities 

by  Winters,    S.  N. 




SPEEDED  LISTENING:    A  PROMISING  PATH  TO 
REMEDIATION  OF  READING  DISABILITIES 

Shirley  N.   Winters,   Ed.  D. 
ABSTRACT 
Many sighted youngsters possessing average and above average intelligence
have reading disabilities which prevent them from succeeding in school and
which subsequently alienate them from the mainstream of academic life.
Remedial reading techniques are most often geared to the visual modality,
the same one, in effect, which has probably been and remains the student's
weakest area of preference. The aural modality remains relatively neglected
in this area despite research which has found speeded listening a highly
efficient alternative to reading (Winters, 1973). In a highly literate
society, academic, business, and professional achievement remains keyed
to skillful reading, and efficient listening alone will not often lead to
success.

This examiner is now conducting research that will combine simultaneous
speeded listening and reading, hypothesizing that it will improve the
reading comprehension of children with reading disabilities. This is
based on her own applied research, which found that sighted seventh grade
students with reading problems understood significantly more of what they
heard with speeded listening than did similar students who read the same
textbook-type test material, and on a general conceptual framework of rate
and thought which grows from the research of leaders in listening and
reading. Nichols found that rate of thought ranges from 400 to 800 wpm;
Rankin found that emphasizing rate improves vocabulary, comprehension, and
general reading proficiency of students; Van Voorhees found evidence that
rate of information processing is a significant variable of the critical
reading process.

The implications of this research are that speeded listening can be
used to improve the reading comprehension of children with learning
disabilities and to individualize instruction in heterogeneously grouped
classrooms.


Winters, Shirley N., An Investigation of Compressed Speech as an
Alternative to Reading, Hofstra University, 1973.




SSLR: SIMULTANEOUS SPEEDED LISTENING AND READING

A  PROMISING  PATH  TO  REMEDIATION  OF   READING  DISABILITIES 


Shirley  N.  Winters,  Ed.D.* 

Academic success is generally geared to proficient
reading ability and measured by standardized tests. However,
it is important to point out that such tests measure the
frustration level, or lowest reading ability, of a student, rather
than the instructional level, or that point at which he can
read and comprehend with some guidance from a teacher. Thus,
it seems clear that a student measuring two or more years
below grade level on a standardized test can neither cope with
his textbooks nor keep up with his classwork. In addition,
his sense of failure is deepened if he is of average or above
average intelligence because he realizes that he is capable of
more than he is achieving. His behavior reflects his
frustration, and he responds either by becoming a disruptive influence
in the classroom or by withdrawing, shutting himself off from the
failure he has come to expect in school. Either way, his
attention is diverted from school efforts, and his concentration
dissipated.

Learning to read is deemed so important to academic
and professional success that it has been declared a basic
right of each school child (J. Allen, HEW). Nevertheless, it
is clear that many students are not mastering this process.
The problem that precipitated this research centered on the
fact that an increasing proportion of students entering seventh
grade in the six schools of a central high school district
were scoring two or more years below grade level on a
standardized reading test. Remedial reading techniques are most often
geared to the visual modality, the same one, in effect, which has
probably been and remains the student's weakest area of
preference. The purpose of this study was to determine whether
speeded listening, in which word rate is increased without
distortion, was not only an alternative to reading, but in fact
an efficient remedial reading technique when simultaneously
combined with reading.

According to the 1970 Federal Census, this suburban
New York community had a median income of $13,522; 3.6 percent
of the people had incomes below the poverty level; 5<3.2 percent
worked in white collar occupations; 21 percent in manufacturing
occupations; and 54 percent of all adults were graduates of
a four year high school. Although there has been a growing
nucleus of one parent homes caused by separation, death, or
divorce, most families remain intact. Parents generally
express the hope that their children will attend college.
Often both parents hold jobs outside the home. The subjects
of this study attended one of the six junior-senior high
schools in the district.


*Dr. Shirley N. Winters is Coordinator of Reading for the
Sewanhaka (N.Y.) Central High School District and an adjunct
professor at Hofstra University, N.Y., where she instructs
graduate students in methods of teaching reading at the
elementary and secondary levels.

Many researchers have attempted to determine whether
teaching listening skills improves reading comprehension.
Results of these studies have been contradictory, but there
has been general agreement that there is a high correlation
between reading and listening comprehension (Horn, Stroud,
Goldstein); that emphasizing rate improves comprehension,
vocabulary, and general reading proficiency of students (Rankin);
and that rate of information processing is a significant variable
of the critical reading process (Van Voorhees).

However, despite these positive findings, listening
is still an inefficient method of learning, bound to the pace
of the speaker, which is approximately 125 to 150 wpm. Its
inefficiency becomes even more apparent when you consider that
high school students who are considered only good readers read
at about 250 wpm and that Nichols posited that the rate at which
the brain can process information ranges from 400 to 800 wpm.
It is easy to see why the listener's attention is easily
diverted. Speeded listening provides one way of controlling that
loss of attention; it provides the element of general arousal
necessary to command attention (Deutsch and Deutsch). It
also makes listening without loss of comprehension possible
at speeds comparable to efficient reading rates, and provides
the means by which the student's existing listening vocabulary
might serve to improve his impoverished reading vocabulary.
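
The arithmetic behind these rates can be made explicit. If a recording spoken at a given word rate is compressed to a faster target rate, the playback time is the ratio of the two rates. The sketch below is illustrative only: the function names and the 3,000-word passage are assumptions, not figures from the study.

```python
def relative_duration(original_wpm, target_wpm):
    """Playback time as a fraction of the original recording time."""
    if original_wpm <= 0 or target_wpm <= 0:
        raise ValueError("word rates must be positive")
    return original_wpm / target_wpm

def minutes_to_hear(word_count, wpm):
    """Minutes needed to present a passage of word_count words at wpm."""
    return word_count / wpm

# Speaking rates cited in the text, compressed to 250 wpm.
for spoken in (125, 150):
    frac = relative_duration(spoken, 250)
    print(f"{spoken} wpm -> 250 wpm: {frac:.0%} of original time")

# Hypothetical 3,000-word passage at a normal and a speeded rate.
print(minutes_to_hear(3000, 150), minutes_to_hear(3000, 250))
```

Under these assumptions, speech recorded at 125 to 150 wpm must be shortened to between one half and three fifths of its original duration to reach 250 wpm.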

Orr et al., in a comprehensive self-pacing study, concluded
that time-compressed speech was a potential method for improving
the processing of educational information; Enc and Stulurow
found that faster rates of 265 to 350 wpm were more efficient
for learning than slower rates for blind seventh and eighth
grade boys and girls of average and above average intelligence.
Goldhaber found that junior high school students learned
significantly more than college students at both 165 and 330 wpm
and that at 330 wpm subjects recalled about 75 percent of the
factual content that they recalled at 165 wpm.
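
One way to read the Goldhaber figures is in terms of content recalled per minute of listening. The short calculation below is an inference from the two numbers quoted above, not a result reported in that study; it assumes the same passage was heard at both rates, and the function name is illustrative.

```python
def relative_efficiency(recall_fraction, base_wpm, fast_wpm):
    """Content recalled per minute at the fast rate, relative to the base rate."""
    time_fraction = base_wpm / fast_wpm  # listening time relative to base rate
    return recall_fraction / time_fraction

# About 75 percent of the content recalled in half the listening time.
ratio = relative_efficiency(0.75, 165, 330)
print(ratio)
```

On this reading, recalling three quarters of the content in half the time corresponds to roughly one and a half times as much recalled content per minute of listening.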




This investigator, in an earlier study, found that
seventh grade students with reading problems, of average and
above average intelligence, achieved significantly higher
scores on comprehension when they listened at 250 wpm than
did a comparable control group who read the same material.
This examiner concluded that speeded listening was a viable and
efficient alternative to reading for seventh grade students
with reading problems. At the same time, this examiner
informally noted that the major difference in attitude between
the reading and listening groups seemed to be the ability to
concentrate. The reading group read the passages, answered
the questions, and put their pencils down, rarely going back
to check on passages, or to re-read for answers although
encouraged to do so. On the other hand, the listening group
listened very carefully before writing an answer. The questions
were repeated three times and it was noted that many students
listened all three times before answering. This examiner
suggests that speeded listening was the variable that accounted
for the greater concentration and considered the possibility
that combined speeded listening and reading might result in
greater concentration and improved reading comprehension.

In  this  study,  this  examiner  hypothesized  that  seventh 
grade  students  of  average  and  above  average  intelligence, 
reading  two  or  more  years  below  grade  level  on  a  standardized 
reading  test,  who  simultaneously  read  and  listened  at  250  wpm 
to eight taped lessons would: (1) score significantly higher
on  a  standardized  reading  test  than  a  comparable  group 
reading  the  same  lessons;  (2)  pay  closer  attention  to  reading 
than  would  a  comparable  group  reading  the  same  lessons. 

PROCEDURES: Seventy seventh grade students in remedial
reading classes, reading two to five years below grade level
on the Metropolitan Achievement Test, Form Am, were divided at
random into two groups: a reading or control group and a
reading/listening or experimental group. On the basis of the
Metropolitan scores, each student was classified as 4, 5, or 6.
These numbers represented grade levels. All students were
exposed to compressed speech, listening at 250 wpm to passages
from Gates-Peardon Reading Exercises, Advanced, for twenty
minutes. Compressed speech was explained to the students and
questions about it answered by the examiner. Students were
told that those who would be in the reading group could
alternate with those in the reading/listening group once the
experiment was completed. Before the experiment started, the
examiner had recorded and compressed 24 cassette tapes on the
Hitachi Varispeed tape recorder: eight lessons on each grade
level, which concentrated on the skills of finding the main idea
and facts in a reading passage.

On the first day after the testing, this examiner taught
a directed skills lesson on how to find the main idea and
on the second day a directed skills lesson on how to find
the details in a reading passage. On the third through tenth
days the students either read or simultaneously listened and
read eight lessons on the grade level to which they had been
assigned. The lessons were corrected and reviewed with
students by one of the three reading teachers or the educational
aide. Both groups were told they could go back into the
passages to confirm or find an answer. Each lesson was
conducted during a regular 43-minute school period. The three
reading teachers and one aide were asked to observe closely
whether students actually were reading while listening, whether
their attention was held for the duration of each tape, and
whether the reading group was concentrating on reading or if
their attention strayed. Both groups were post tested with
the Metropolitan Achievement Test, Form Bm. A comparison of the
means was made by t tests and the level of confidence was
set at .05.
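
A comparison of two group means by a t test can be sketched as follows. The paper does not state which form of the t test was used, so the pooled-variance (equal variance) form for independent groups is an assumption here, and the scores are hypothetical grade-equivalent values rather than data from the study.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(group_a, group_b):
    """Student's t for two independent groups, assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df

# Hypothetical post-test grade-equivalent scores, not data from the study.
reading_listening = [5.1, 4.8, 5.4, 4.9, 5.3]
reading_only = [4.4, 4.7, 4.2, 4.6, 4.5]
t, df = pooled_t(reading_listening, reading_only)
print(round(t, 2), df)
```

The obtained t is then compared with the tabled critical value for the given degrees of freedom at the chosen level, here .05.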

FINDINGS: The results of a t test performed on reading
comprehension pretests for the two groups showed no
statistically significant difference between the two groups at the start
of the experiment. The results of a series of t tests
indicated that at the end of the experiment, the reading/listening
or experimental group did have significantly higher comprehension
scores at the .05 level of confidence on the fourth and fifth
grade levels than did comparable reading groups who read the
same material. The sixth grade group, although included in
Table 1, was not included in the findings because the total of
six students in both groups, control and experimental, was too
small a number after attrition. The total reading/listening
group, which included the students from the sixth grade group,
achieved significantly higher comprehension scores at the .01
level of confidence than did the total reading group, which
also included the three students from the sixth grade group.

Based on informal teacher observations, some students
had to be reminded to read while they listened; they preferred
to listen only. Once reminded, however, their attention was
held to the reading material, except for one boy who complained
so bitterly that he was removed from the experiment and
permitted to complete the lessons in the reading group. The students
who were reading only seemed to fall into two categories:
they were either attentive with little or no encouragement
and remained so, or were inattentive and rarely responded
to encouragement. The students in the reading/listening
group read and listened intently to the tapes after a few
early reminders in the first day or two.

CONCLUSIONS: This examiner concluded as a result of
these findings that simultaneous reading and listening at
250 wpm is a viable and efficient remedial reading technique
to improve reading comprehension and that it provides an
arousal to attention which holds the concentration of youngsters
severely retarded in reading.

IMPLICATIONS: One of the implications of this research is
that students who are severely retarded in reading but who have
average intelligence may understand content material of a
higher grade level than the one they actually achieve on a
standardized reading test, if they can simultaneously listen
and read at a speeded rate. As an experiment, students who
had achieved anywhere from grades 2.8 to 3.9 on the
Metropolitan Achievement Test, Form Am, were assigned to the fourth
grade reading level material. The rationale behind this was
both pragmatic, since content area material generally starts
at fourth grade reading level, and theoretical, since students
are said to have better listening vocabularies than reading
vocabularies. Thus, it was hypothesized that these students
could understand the fourth grade material at least minimally,
when it was read to them on the tapes. Of the six students,
four did in fact score from 4.2 to 5.1 grade levels on the
post test; two made no progress. It is impossible to
generalize from this small number of subjects. However, this type
of junior high school student, three to five years retarded
in reading, of average intelligence, is so resistant to
ordinary remedial reading techniques that responses such as
these four indicated would seem to warrant further research.

Another implication of this study is that the integration
of auditory-visual modalities at a rate fast enough
to engage thought processes improves not only reading
comprehension but memory. In the study, memory or at least
immediate recall was factored out. Students were encouraged
to stop the machine, go back into the passage, or replay the
tape to find answers. Yet, when these same students took a
standardized reading test with the memory factor included,
they scored significantly better on the post test than
they scored on the pretest. The combination of both
modalities may account for improvement of memory as evidenced in
this study. It is suggested that further research in this
area might prove fruitful.

Another area of future research is a hypothesis
that seems to grow out of this study. Does simultaneous
speeded listening and reading provide the student with cue
samples of a word or phrase, which in turn improve his reading
comprehension? If the listener's uncertainty about a word has
been reduced by the context in which it is presented, that word
does not have to be scrutinized for internal characteristics. It is
suggested that cue sampling and the reduction of uncertainty
(Smith) enable the reader/listener to make associative
connections between words as he listens and to form expectancies
of what words should follow, and that they influence the speed
with which he recognizes an idea. Some cue samples may stem from the
reader/listener's own speaking and listening vocabulary.
Once he hears the word or phrase, he may receive sufficient
sampling to help him recognize whole words he would not
have otherwise recognized, eliminating the laborious initial
consonant, letter by letter struggle.

Although  listening  has  been  proven  an  efficient 
alternative  to  reading,  academic,  business,  and  professional 
achievement  remains  keyed  to  skillful  reading  in  our  highly 
literate  society.   It  is  the  despair  of  our  school  people 
that  many  youngsters,  possessing  average  and  even  above 
average  intelligence,  have  reading  disabilities  which  do  not 
respond  to  traditional  remediation  and  which  prevent  them 
from  succeeding  in  school.   Hopefully,  simultaneous  speeded 
listening  and  reading,  a  combination  of  the  visual  and  aural 
modalities,  may  provide  one  means  of  improving  reading 
comprehension,  enough  so  that  at  least  some  of  these  students 
have  a  chance  to  compete  successfully  in  the  mainstream  of 
academic  life. 


[Table 1. Results for the reading and reading/listening groups (table not legible).]



REFERENCES 

1.  James E. Allen, Jr., The Right to Read-Target for the 70's.
    U.S. Department of Health, Education, and Welfare,
    Office of Education, September, 1969.

2.  J. A. Deutsch and D. Deutsch, "Attention: Some Theoretical
    Considerations," Psychological Review, 1963, Vol. 70,
    No. 1, pp. 80-90.

3.  Mitat E. Enc and Lawrence M. Stulurow, "Comparison of
    Effects of Two Recording Speeds on Learning and Retention,"
    New Outlook for the Blind, 54 (February, 1960), 39-48.

4.  Gerald M. Goldhaber, "Listener Comprehension of Compressed
    Speech as a Function of the Academic Grade Level of the
    Subjects," Journal of Communication, 20 (June, 1970),
    167-73.

5.  Harry Goldstein, "Reading and Listening Comprehension at
    Various Controlled Rates," Teachers College, Columbia
    University Contributions to Education, No. 321 (N.Y.:
    Bureau of Publications, Teachers College, 1940).

6.  Ernest Horn, "Language and Meaning," Chapter XI, Forty-first
    Yearbook of the National Society for the Study of
    Education, Part II, Bloomington, Illinois, Public School
    Publishing Company (1942), 377-413.

7.  Ralph G. Nichols, "Ten Components of Effective Listening,"
    Education, 75 (January, 1955), 292-302.

8.  David B. Orr, Herbert L. Friedman, and Cynthia N. Graae,
    "Self-Pacing Behavior in the Use of Time Compressed
    Speech," Journal of Educational Psychology, 60 (1969),
    23-31.

9.  Paul T. Rankin, "Importance of Listening Ability," The
    English Journal, XVII (October, 1928), 630.

10. Frank Smith, Understanding Reading, Holt, Rinehart, and
    Winston, Inc., New York, 1971, 191-200.

11. James B. Stroud, Psychology in Education, New York,
    Longmans, Green and Company, 1946, 443.

12. Sylvia Van Voorhees, Rate of Information-Processing as a
    Variable of Critical Reading, Hofstra University, 1974.

13. Shirley N. Winters, An Investigation of Compressed Speech
    as an Alternative to Reading, Hofstra University, 1973.




